Introduction

The ongoing coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), presents a serious public health threat, with 5.5 million deaths and 272 million reported cases worldwide as of December 2021. Since the pandemic began in 2019, pneumonia, acute respiratory distress syndrome, pulmonary thrombosis, clinical phenomena, and management strategies have been broadly investigated1,2,3,4,5. Reducing the fatality and incidence of the disease remains challenging despite increasing vaccination rates, owing to uncertainty arising from recently identified COVID-19 variants6. The Omicron B.1.1.529 variant (hereinafter “Omicron”) was identified by the Technical Advisory Group on SARS-CoV-2 Evolution, and the World Health Organization considers it a variant of concern7.

Despite its recency, studies on the transmissibility and fatality of Omicron have been conducted8,9,10,11. Omicron has a shorter incubation period and similar or milder clinical symptoms than previous variants12. It has a lower fatality rate but higher transmissibility than the existing variants. Most patients with this variant have mild symptoms and a good prognosis; however, some progress rapidly to serious conditions, including septic shock, multiple organ failure, and acute respiratory failure13. The COVID-19 pandemic has strained the global healthcare system, particularly through shortages of hospital beds, medical equipment, and medical and nursing staff14. Since March 2020, the Korean government has operated Residential Treatment Centers (RTCs), in addition to existing medical services, to quarantine, regularly screen, and monitor asymptomatic and mild COVID-19 patients15. Because Omicron has a lower fatality rate, the number of mild patients is expected to increase, and patient surveillance is consequently expected to become more important in future pandemics.

Many approaches to assessing the risks of COVID-19 existed even before the Omicron outbreak. Kostakis et al. used the National Early Warning Score (NEWS) 1 and 2, widely used in emergency medicine, to predict mortality and intensive care unit admission in patients with COVID-1916. The NEWS has been widely used in patients with COVID-19, including for predicting in-hospital mortality17,18, predicting poor outcomes after hospitalization19, and identifying deterioration20,21. Other emergency score systems, including the rapid emergency medicine score (REMS)22,23 and quick sepsis-related organ failure assessment (qSOFA)24,25, have also been used to predict mortality. Liu et al.26 compared these emergency score systems in terms of effectiveness for risk prediction in COVID-19 patients. Early detection of COVID-19 has also been attempted using mathematical modeling27, knowledge-based expert systems28, deep learning29,30, and cloud-based image analysis31. However, most studies have focused on determining the current patient severity and mortality, and few have predicted changes in the patients’ condition. Even fewer have focused on patients with mild COVID-19.

We propose a novel scoring system for patients with mild or asymptomatic COVID-19 to predict rapid deterioration and the requirement for transfer to tertiary hospitals. As in previously proposed scoring systems, the values measured on the day of admission were used. Our proposed scoring system was compared with previous emergency scoring systems. Here, we attempted to identify patients with mild or asymptomatic COVID-19 who needed special care because of the high likelihood that their condition could worsen rapidly. Although our study covered a single racial or regional group, social factors and circulating variants differed between the cohorts because the collection periods differed and the virus mutated rapidly. The data were collected during a period of strong social distancing policies, and this study was based on community transmission rather than in-hospital or intra-facility infection. To our knowledge, this is the first study to employ a scoring system for predicting hospital admission requirements among patients with mild COVID-19. The proposed system will assist governments in devising differentiated strategies for high- and low-risk cohorts, thereby optimizing the management of limited healthcare resources and mitigating patient anxiety.

Results

Demographics and clinical characteristics

In the derivation cohort, the average ages of the transfer and non-transfer patients were 50.9 and 38.8 years, respectively. The average temperature of transfer patients was 37 °C, which is higher than normal. All continuous variables except respiratory rate showed significant differences between the transfer and non-transfer groups. Among the categorical variables, only hypertension showed a significant between-group difference. Patients who were transferred were older; had lower oxygen saturation (SpO2) and a higher pulse rate; and showed higher levels of systolic blood pressure (SBP), diastolic blood pressure (DBP), fever, and hypertension. Unlike in the derivation cohort, SBP and DBP did not differ significantly between groups in the external validation cohort; however, diabetes and hypertension did. The triage center’s patient demographics and clinical characteristics are summarized in Table 1.

Table 1 Demographics and clinical characteristics of patients in derivation and external validation cohorts.

Clinical indicators associated with the transfer

The simple and multivariable logistic regression results for the factors associated with transfer are shown in Table 2. In the simple logistic regression analysis, most parameters, with the exception of male sex, liver disease, kidney disease, and organ transplant, were significantly associated with transfer (p < 0.01). Of the ten variables included, four candidate variables (age, pulse rate, SpO2, and temperature) were significantly related to transfer in the multivariable regression analysis.

Table 2 Logistic regression analyses to determine each parameter’s association with the transfer.

Predictive performance of our scoring system

Each scoring system’s discriminatory ability for predicting transfer was compared using receiver operating characteristic (ROC) curves (Fig. 1A). Our scoring system had an area under the ROC curve (AUC) of 0.868, which was higher than those of the NEWS, REMS, and qSOFA (0.646, 0.612, and 0.509, respectively; Fig. 1B). The optimized thresholds were determined from the ROC curve as a trade-off between sensitivity and specificity to evaluate the prediction’s accuracy. Our scoring system had the best predictive performance for identifying whether patients were transferred when the threshold was 4.

Fig. 1 Comparison of the performance of the National Early Warning Score (NEWS), Rapid Emergency Medicine Score (REMS), quick Sepsis-related Organ Failure Assessment (qSOFA), and our scoring system in predicting transfer. (A) Receiver operating characteristic (ROC) curves of the four methods for prediction of transfer. (B) Performance was compared by calculating the area under the ROC curve (AUC) for the NEWS, REMS, qSOFA, and the proposed scoring system, with the DeLong test. A single asterisk (*) indicates a p-value lower than 0.05, and a double asterisk (**) indicates a p-value lower than 0.01.
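For readers who wish to reproduce this kind of comparison, the AUC of an integer score can be computed directly as the probability that a randomly chosen transfer patient has a higher score than a randomly chosen non-transfer patient, with ties counted as one half. The following is a minimal sketch only: the function name `auc_from_scores` and the toy data are illustrative assumptions, not the study's code or data, and the DeLong comparison between systems is not reproduced here.

```python
def auc_from_scores(scores, labels):
    """Return the AUC: P(score of a transfer case > score of a non-transfer case).

    Ties between a positive and a negative case count as half a win,
    which matches the Mann-Whitney U formulation of the AUC.
    Labels are assumed to be coded 1 (transferred) or 0 (not transferred).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both outcome classes are required to compute an AUC.")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT study data): risk scores and transfer labels.
toy_scores = [6, 5, 3, 4, 2, 1, 5, 0]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"Toy AUC: {toy_auc:.3f}")
```

The pairwise loop is quadratic in the cohort size, which is acceptable for a sketch; a rank-based implementation would be preferable for large cohorts.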

The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value at the optimized threshold values for each model are shown in Table 3. Our system showed the best accuracy, followed by the qSOFA, NEWS, and REMS, and the best performance in all indicators except sensitivity. The REMS, which had the highest sensitivity, had an accuracy of < 50%, indicating poor performance in predicting negative cases. Although the positive predictive values were relatively low in all systems owing to data imbalance, our system had a value of 0.356, which was higher than that of the NEWS.

Table 3 Performance comparison of the four scoring systems at the optimal cut-off.

External validation

Our scoring system, built from the variables that were significant in the derivation cohort, was applied to the external validation cohort, and its predictive performance was compared with that of the other models, as shown in Table 4. The external validation cohort had a more severe data imbalance than the derivation cohort; therefore, the overall predictive performance, especially the sensitivity and positive predictive value, was low. All models except our system had AUC values < 0.6 and positive predictive values < 0.1. Our system showed relatively stable performance for all indicators except sensitivity. Sensitivity and specificity at different thresholds are shown in Supplementary Table 1. Regarding accuracy, all systems performed better than in the derivation cohort, possibly because of the higher proportion of negative cases in this cohort.

Table 4 Performance comparison of the four scoring systems in the external validation cohort.
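The effect of class imbalance on these indicators can be illustrated with a short calculation. The sketch below holds sensitivity and specificity fixed and varies only the outcome prevalence, using the transfer rates reported in the Methods (8.6% in the derivation cohort and 3.6% in the external validation cohort). The sensitivity and specificity values are hypothetical assumptions chosen for illustration and are not taken from Tables 3 or 4; the function name `expected_metrics` is likewise illustrative.

```python
def expected_metrics(sensitivity, specificity, prevalence):
    """Expected accuracy, PPV, and NPV for a test applied at a given prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # Expected proportions of the cohort falling in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)

    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    return {"accuracy": tp + tn, "ppv": ppv, "npv": npv}


# Hypothetical operating point (an assumption, NOT a value from the study tables).
ASSUMED_SENSITIVITY = 0.70
ASSUMED_SPECIFICITY = 0.85

# Transfer rates as reported in the Methods section.
derivation = expected_metrics(ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, 0.086)
validation = expected_metrics(ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, 0.036)

for cohort, metrics in (("derivation", derivation), ("validation", validation)):
    print(cohort, {key: round(value, 3) for key, value in metrics.items()})
```

Under these assumptions, a lower transfer rate lowers the positive predictive value while raising the accuracy whenever specificity exceeds sensitivity, which is the pattern described above.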

Discussion

Identifying mild COVID-19 is essential for effectively controlling the pandemic, owing to its wide prevalence, which was 80% in a study of 72,314 patients32. As COVID-19 is often mild at diagnosis, preemptively responding to the worsening of mild disease is important32. Additionally, since most patients with mild symptoms are recommended to isolate at home or are managed in RTCs with insufficient medical facilities and staff33, it is difficult to respond quickly to serious conditions. Therefore, it is necessary to enable easy screening of those with a high probability of deterioration, which calls for a scoring system that can be calculated easily from simple vital signs and clinical indicators. We have proposed a novel scoring system that could help predict whether patients with mild COVID-19 will be transferred based on their characteristics and clinical parameters collected on the day of admission. Our scoring system will enable proactive determination of whether a patient requires close observation. Several scoring systems exist for the early prediction of patients requiring emergency treatment34,35,36. However, we believe this is the first study to use a scoring system to predict the need for hospital admission among patients with mild COVID-19. This will help governments to formulate different strategies for high- and low-risk groups, which will allow better management of limited health care resources and will be useful for decreasing anxiety in patients.

The variables used in our system included age, pulse rate, SpO2, systolic and diastolic blood pressures, temperature, and hypertension. Compared with previous systems, the only additional variables were DBP and hypertension. Regarding age, an important variable in the other systems, our system suggests that classification at a lower age is necessary (threshold of our system = 39, REMS = 45). Because most patients had mild disease, we needed to set the threshold SpO2 value relatively high compared with that suggested by other systems (threshold of our system = 98, REMS = 89, and NEWS = 96). Our scoring system also has lower pulse rate and temperature thresholds for scoring a result as abnormal than other scoring systems, possibly because existing systems focus on distinguishing severely ill patients. Our results confirm that the proposed system sets sensitive cut-offs for each variable, suited to assessing the condition of patients with milder disease.

Consequently, our system had superior performance in predicting whether the patient would be transferred compared with the previous systems. It showed a statistically significant improvement in AUC compared with previous systems, and high values in almost all performance evaluation parameters. However, data bias may have occurred because the outcome occurred infrequently and because the outcome itself, the transfer of patients with mild COVID-19 to the hospital, was determined by the subjective judgment of the medical staff. More cautious staff may decide to transfer patients to the hospital before their condition becomes serious. This would have had a greater effect on the transfer group, which had a smaller sample size, than on the non-transfer group. To exclude the effects of these biases, we verified our system in an external validation cohort from another institution and confirmed that it still showed promising results (Table 4). Furthermore, all patients tested had indications for the test; therefore, we consider that there was no referral bias for most subjects14.

According to the transfer prediction results of previous systems, the classification performance was in the following order: NEWS, REMS, and qSOFA. The NEWS is a clinical evaluation tool developed by the Royal College of Physicians in the UK, and can be calculated using the respiratory rate, SpO2, temperature, SBP, heart rate, and consciousness level34. The NEWS showed better mortality predictive power in patients with severe COVID-19 than other scoring systems26. In one study26, the authors reported that NEWS2’s predictive performance37, which added two SpO2 scales, was as good as that of the NEWS; however, we did not compare it with our system because the added SpO2 scale was not useful when analyzing patients with mild COVID-19. Our study confirmed that the existing risk prediction scoring systems, including the NEWS, are somewhat useful in predicting deterioration in mild COVID-19 patients at an early stage. However, the proposed system performed significantly better than the existing systems because, unlike them, it was developed with sensitive threshold adjustments to predict deterioration in patients with mild COVID-19. Recent studies have suggested applying the MuLBSTA score, which includes bacterial co-infection, hypo-lymphocytosis, and multilobular infiltration among its components, for predicting COVID-19 infection severity38,39,40. In contrast, because our system uses only relatively easy-to-measure variables, the evaluation can be performed quickly and easily. Additionally, it can be performed even by non-professionals; therefore, it can be applied in resource-poor environments where there are few medical staff with special training.

Despite using only variables that are relatively easy to measure, our scoring system performed better than other indicators in predicting the worsening of mild COVID-19 and showed strong predictive power in external validation, so it can be applied in a variety of ways in actual clinical environments. First, medical institutions that diagnose COVID-19 can apply the scoring system to confirmed patients to indicate whether a patient's condition is expected to worsen. Patients whose condition is expected to worsen require closer observation, whereas patients whose condition is not expected to worsen may need only isolation to prevent transmission. In addition, with emerging wearable device technology, it is possible to predict the deterioration of the patient's condition in real time by fixing age and hypertension, which are variables that do not easily change, and measuring only the remaining four indicators with a wearable device. Nevertheless, this study has some limitations. First, it was retrospective, and all data were collected as part of usual care and not for research. Therefore, it was impossible to use new variables to predict which patients with mild COVID-19 could worsen.

In conclusion, this study proposed a novel scoring system for predicting which patients with mild COVID-19 will experience deterioration using simple measurable variables. The proposed system achieved an AUC of 0.868 in the derivation dataset and an accuracy of 0.899 in the external validation dataset, outperforming existing scoring systems that evaluate patient severity, such as the NEWS, REMS, and qSOFA. This system can be used as an effective tool for early screening of deterioration in mild COVID-19 patients. In particular, it can help to efficiently manage limited resources in areas that lack medical infrastructure and where the number of confirmed patients is increasing explosively. In future studies, the effectiveness of the proposed system for early screening of patients who are expected to deteriorate could be confirmed in the context of medical resource management.

Methods

Sample and data

This study retrospectively used data from patients who received treatment at the RTCs operated by Seoul National University Hospital (SNUH). All patients admitted to the RTC had been diagnosed with COVID-19, with laboratory confirmation by real-time reverse transcription-polymerase chain reaction testing at local health centers. Upon diagnosis, a group of public health experts triaged the patients and determined the treatment policy based on symptom severity. Patients with severe symptoms were hospitalized in a negative-pressure isolation room, while mild and asymptomatic patients were admitted to the RTC. The government entrusted several hospitals with the operation of RTCs to manage patients with mild COVID-19. The RTC is not a hospital facility and was originally operated using existing accommodation facilities41. Using the SNUH information and communication technology–based remote patient monitoring system, all patients admitted to the RTC operated by SNUH used a mobile app to self-report their past medical history and subjective acute COVID-19 symptoms. Patients admitted to the RTC received non-face-to-face treatment from medical staff at least once daily and were transferred to a local hospital if their condition worsened.

From March 6, 2020, to January 12, 2021, during the early stages of the COVID-19 pandemic, RTCs were operated in the Mungyeong and Seongnam regions. The data from patients admitted during this period were used as the derivation cohort. From July 15, 2021, to January 12, 2022, the RTC in Seongnam was reopened, and the data from patients admitted during this period were used as the external validation cohort. This period coincided with the prevalence of mutant viruses. The detection rate of the Alpha variant declined from about 13%, and that of the Delta variant increased from 33 to 100% and then decreased to 73%. The Omicron variant began to be detected in December 2021 and accounted for 26% by the end of recruitment. Over time, the percentage of severe patients declined: the transfer rate fell from 8.6% in the derivation cohort to 3.6% in the external validation cohort. Reports were made once each on admission and discharge, and twice a day during the quarantine period. The data recorded from patients were stored in the SNUH hospital information system. These electronic health records were extracted using a clinical data warehouse at the SNUH, the SNUH Patient Research Environment. From the health records entered by the medical staff, fifteen variables were collected: sex, age, pre-existing conditions (diabetes, hypertension, cardiovascular disease, respiratory disease, liver disease, kidney disease, organ transplant), and vital signs (systolic blood pressure (SBP), diastolic blood pressure (DBP), heart rate (HR), body temperature, and oxygen saturation (SpO2)). Blood pressure was measured using the Blood Pressure Monitor (OMRON Healthcare, Kyoto, Japan), body temperature using the Axillary Thermometer (TERUMO Medical Corporation, Tokyo, Japan), and oxygen saturation using the Pulse Oximeter Monitor (OMRON Healthcare, Kyoto, Japan).
Furthermore, to enhance the accuracy of vital sign measurements, patients were instructed upon admission to watch educational videos on the method of vital sign measurement by skilled nurses, which allowed them to measure their own vital signs. Each time vital signs were measured, the responsible nurse double-checked to ensure proper measurement and entry of values. All collected variables were measured at the time of admission to the RTC. The primary outcome was an all-cause transfer to the hospital while staying at the center.

Parameters and measures of parameters

Using logistic regression analysis, a scoring system for transfer prediction was developed from selected variables known to be statistically related to COVID-19 severity. Variables that differed significantly between transfer and non-transfer patients in demographics and clinical characteristics were selected. The scoring system was developed using seven variables (age, pulse rate, SpO2, SBP, DBP, temperature, and hypertension) with significant differences between the transfer and non-transfer groups (Table 1). Among these, four variables (age, pulse rate, SpO2, and temperature) had high odds ratios and significant p-values in the multivariable logistic regression; they were therefore considered more important and allocated 2 points each, whereas the remaining three variables were allocated 1 point each. A score ranging from 0 to 11 was allocated to each patient by adding the points. The cut-off value for each variable was determined by comparing the transfer and non-transfer patients’ characteristics, and each threshold was set at the point where the difference between the transfer and non-transfer groups was greatest. Age > 39 years, pulse rate > 86 bpm, SpO2 < 98%, and temperature > 37 °C were each assigned 2 points; SBP > 118 mmHg, DBP < 77 mmHg, and hypertension were each assigned 1 point. Finally, each patient’s total score was calculated. Patients with scores of 0–4 were classified as non-transfer, and those with scores of 5–11 as transfer (Table 5).

Table 5 Scoring system for predicting the transfer of a patient with COVID-19 to a hospital.
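The scoring rule described above can be written compactly in code. The following is a minimal sketch, assuming the thresholds, directions, and point values exactly as stated in the text; the class name `AdmissionRecord` and the function names are illustrative and are not part of the study's software. The example patients are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdmissionRecord:
    """Values measured on the day of admission (field names are illustrative)."""
    age: float            # years
    pulse_rate: float     # beats per minute
    spo2: float           # oxygen saturation, %
    sbp: float            # systolic blood pressure, mmHg
    dbp: float            # diastolic blood pressure, mmHg
    temperature: float    # degrees Celsius
    hypertension: bool    # pre-existing hypertension


def transfer_score(record: AdmissionRecord) -> int:
    """Sum the points for each criterion; the result ranges from 0 to 11."""
    score = 0
    # Two points each for the four variables significant in multivariable regression.
    score += 2 if record.age > 39 else 0
    score += 2 if record.pulse_rate > 86 else 0
    score += 2 if record.spo2 < 98 else 0
    score += 2 if record.temperature > 37 else 0
    # One point each for the remaining three variables
    # (directions follow the Methods text as written).
    score += 1 if record.sbp > 118 else 0
    score += 1 if record.dbp < 77 else 0
    score += 1 if record.hypertension else 0
    return score


def predicts_transfer(record: AdmissionRecord, cutoff: int = 4) -> bool:
    """Scores of 0-4 are classified as non-transfer; scores of 5-11 as transfer."""
    return transfer_score(record) > cutoff


# Hypothetical patients for illustration only (not study data).
patient_a = AdmissionRecord(age=52, pulse_rate=92, spo2=97, sbp=110, dbp=80,
                            temperature=36.8, hypertension=False)
patient_b = AdmissionRecord(age=30, pulse_rate=80, spo2=99, sbp=120, dbp=70,
                            temperature=36.5, hypertension=False)

print(transfer_score(patient_a), predicts_transfer(patient_a))
print(transfer_score(patient_b), predicts_transfer(patient_b))
```

Keeping the cut-off as a parameter makes it straightforward to examine sensitivity and specificity at other thresholds.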

Proposed scoring systems and data analysis

Categorical variables were presented as numbers (percentages) and analyzed using Pearson’s chi-square test. Fisher’s exact test was used when the expected frequency was < 5 because the chi-square approximation might not hold for the relatively small sample size. Continuous variables were presented as means ± standard deviations. Since they were non-normally distributed, they were compared using the Mann–Whitney U test. Simple and multivariable logistic regression analyses were used to evaluate candidates for constituent variables in the scoring system for predicting patient transfer. In the logistic regression analysis, R-squared was calculated as McFadden’s pseudo R-squared. The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were compared to evaluate the discriminatory power of our scoring system, as well as that of the NEWS, REMS, and qSOFA, in predicting patient transfer. The AUC was used to evaluate the ability to distinguish between patients with and without the outcome of interest. The DeLong test42 was performed to compare the predictive ability of our system with those of other systems. The other values were measured at the optimal cut-off value determined by Youden’s index43. The developed scoring system was applied to an independent external validation cohort. All tests were two-sided, and p < 0.01 was considered statistically significant. All statistical analyses were performed on a computer with an Intel(R) Core(TM) i5 processor and 16 GB of RAM using Python v. 3.8.8 and SciPy v. 1.5.2.
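The cut-off selection and the threshold-dependent indicators can be sketched as follows. Youden's index is J = sensitivity + specificity − 1, and the optimal cut-off is the one that maximizes J. This is an illustrative sketch under stated assumptions rather than the analysis code used in the study: the function names and the toy data are hypothetical, labels are assumed to be coded 1 (transfer) or 0 (non-transfer), and a patient is classified as positive when the score exceeds the cut-off.

```python
def confusion_counts(scores, labels, cutoff):
    """Count TP, FP, TN, FN when 'score > cutoff' is treated as a positive prediction."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score > cutoff
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1."""
    if 1 not in labels or 0 not in labels:
        raise ValueError("Both outcome classes are required.")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(scores, labels, cutoff)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:  # ties keep the lowest cut-off
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def summary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from confusion counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data (NOT study data).
toy_scores = [6, 5, 3, 4, 2, 1, 5, 0]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]

best_cutoff, best_j = youden_cutoff(toy_scores, toy_labels)
metrics = summary_metrics(*confusion_counts(toy_scores, toy_labels, best_cutoff))
print(best_cutoff, round(best_j, 3), metrics)
```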

Ethical declarations

This study was approved by the SNUH’s institutional review board (H-2105-158-1221) and was conducted in accordance with the relevant guidelines and regulations. The ethics committee waived the need for informed consent considering the study design and adherence to the relevant guidelines. In particular, the study data were deidentified to protect privacy and preserve the confidentiality of the study participants.