A risk scoring system to predict progression to severe pneumonia in patients with Covid-19

The rapid outbreak of coronavirus disease 2019 (Covid-19) raised major concerns about medical resource constraints. We constructed and validated a scoring system for early prediction of progression to severe pneumonia in patients with Covid-19. A total of 561 patients from a Covid-19 designated hospital in Daegu, South Korea, were randomly divided into two cohorts: a development cohort (N = 421) and a validation cohort (N = 140). We used multivariate logistic regression to identify four independent risk predictors for progression to severe pneumonia and constructed a risk scoring system by assigning each factor a number of points corresponding to its regression coefficient. We calculated a risk score for each patient and defined two groups: low risk (0 to 8 points) and high risk (9 to 20 points). In the development cohort, the sensitivity and specificity were 83.8% and 78.9%, respectively. In the validation cohort, the sensitivity and specificity were 70.8% and 79.3%, respectively. The C-statistic was 0.884 (95% CI 0.833–0.934) in the development cohort and 0.828 (95% CI 0.733–0.923) in the validation cohort. This risk scoring system is useful for identifying patients at high risk of progression to severe pneumonia and can help prevent unnecessary overuse of medical care in limited-resource settings.

After coronavirus disease 2019 (Covid-19) first emerged in China in December 2019, it was declared a pandemic in March 2020 1. Owing to the surge in Covid-19 patients, the lack of medical resources (e.g., negative pressure rooms, personal protective equipment and medical personnel) has become a major problem. During the height of the outbreak in early March, the Ministry of Health and the Korea Centers for Disease Control and Prevention established a policy guideline for assigning patients with Covid-19 to different treatment centers according to severity and risk, in order to make the best use of medical resources. Patients with mild symptoms were assigned to community treatment centers to recover, whereas high-risk patients (those of older age or with moderate to severe disease) were hospitalized. However, an increasing number of patients classified as mild cases at admission have developed severe pneumonia during hospitalization.
It is important to provide timely critical care to Covid-19 patients with serious symptoms to reduce the number of deaths and the burden on the overall health system 2. It is therefore essential to identify, in the early stages of disease, patients who are at higher risk of developing severe pneumonia. Previous studies show that established and potential risk factors associated with Covid-19 complications are older age (e.g., > 65 years), cardiovascular disease, chronic lung disease, diabetes mellitus, obesity, immunocompromise, end-stage renal disease, and liver disease [3][4][5][6][7]. The purpose of this study is to present a novel scoring system that clinicians can use to predict progression to severe pneumonia in Covid-19 patients at an early stage.
Statistical analysis. Potential risk factors for progression to severe pneumonia were evaluated. The Wilcoxon rank sum test was used for continuous variables and the Chi-square test for categorical variables to examine baseline differences between the stable and progressed patient groups. Most continuous variables that differed significantly at baseline were dichotomized. The cut-off value for each continuous variable was chosen to give the best discriminatory ability based on the Youden index (sensitivity + specificity − 1). Age and LDH were categorized according to the trend in risk probability across successive intervals of age (5 years) and LDH (50 U/L). An age-adjusted logistic regression model was used to identify risk factors. A multivariate logistic regression model with stepwise selection was used to develop the risk prediction model, with a P value threshold of 0.05 for variable selection. Model performance was evaluated with respect to discrimination and calibration. Discrimination was quantified using the C-statistic, and the Hosmer-Lemeshow (H-L) χ² statistic was calculated for calibration.
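The cut-off selection by Youden index described in the statistical analysis can be illustrated with a minimal sketch. The function name `youden_cutoff`, the convention that values at or above the cut-off count as test-positive, and the marker values below are all assumptions made for illustration; they are not the study's data or its SAS/R code.

```python
def youden_cutoff(values, progressed):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a patient is test-positive when value >= cutoff.
    """
    n_pos = sum(progressed)
    n_neg = len(progressed) - n_pos
    best_cutoff, best_j = None, -1.0
    # Every observed value is tried as a candidate cut-off.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, progressed) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, progressed) if not y and v < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up marker values (arbitrary units); 1 = progressed, 0 = stable.
marker = [0.2, 0.4, 0.5, 0.9, 1.1, 1.8, 2.5, 3.0, 4.2, 6.0]
progressed = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = youden_cutoff(marker, progressed)
print(cutoff, round(j, 3))
```

A continuous variable dichotomized at the returned cut-off enters the regression as a binary predictor, which is what makes a simple integer score possible later on.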
The receiver operating characteristic (ROC) curve was generated for the C-statistic, and the squared distance between the observed prevalence and the mean predicted probability in each quintile of predicted risk was assessed for calibration. Model performance was also evaluated in the separate validation cohort. Statistical analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC) and R version 3.6.2.
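To make the two performance measures concrete, the sketch below computes a C-statistic as the proportion of concordant progressed/stable pairs and an H-L χ² statistic over quintiles of predicted risk. It is an illustration under stated assumptions, not the authors' SAS or R code: the function names, the equal-sized grouping rule, and the ten predicted risks are hypothetical.

```python
def c_statistic(pred, outcome):
    """Share of (progressed, stable) pairs in which the progressed patient
    has the higher predicted risk; ties count one half."""
    pos = [p for p, y in zip(pred, outcome) if y == 1]
    neg = [p for p, y in zip(pred, outcome) if y == 0]
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def hosmer_lemeshow(pred, outcome, groups=5):
    """H-L chi-square over `groups` equal-sized bins of predicted risk.

    Assumes every bin has an expected event count strictly between
    0 and the bin size, so that no term divides by zero.
    """
    ranked = sorted(zip(pred, outcome))
    n = len(ranked)
    chi2 = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # sum of predicted risks
        observed = sum(y for _, y in chunk)   # observed progressions
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / size))
    return chi2


# Made-up predicted risks and outcomes (1 = progressed, 0 = stable).
pred = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]
outcome = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

print(round(c_statistic(pred, outcome), 3))
print(round(hosmer_lemeshow(pred, outcome), 3))
```

Turning the H-L statistic into a P value requires a chi-square distribution (conventionally with groups − 2 degrees of freedom in the development data), which is left out here to keep the sketch free of external libraries.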

Results
Clinical characteristics of patients. The selection of the study population is illustrated in Fig. 1. A total of 640 patients were hospitalized from February 21 through March 31, 2020; the follow-up period ended on April 26, 2020. Twenty-one patients were excluded because they were under 18 years of age, 53 because they had severe pneumonia at admission, and 5 because laboratory examinations were lacking. Of the remaining 561 patients, 421 were randomly assigned to the development cohort and 140 to the validation cohort. All patients with stable Covid-19 during hospitalization were followed for more than 4 weeks after admission. Demographics and clinical characteristics of patients who remained stable and of those who progressed to severe pneumonia are summarized in the Supplementary material.
Independent risk factors associated with progression to severe pneumonia. A risk prediction model was developed in the development cohort. In the univariate analysis, age, male gender, comorbidities, initial chest X-ray abnormality, absolute neutrophil count (ANC), absolute lymphocyte count (ALC), platelet count, blood urea nitrogen (BUN), estimated glomerular filtration rate (eGFR), aspartate aminotransferase (AST), albumin, CRP, CPK and LDH were significantly associated with progression to severe pneumonia. The age-adjusted univariate analysis identified male gender, comorbidities, initial chest X-ray abnormality, ALC, platelet count, BUN, AST, albumin, CRP, CPK and LDH as significantly associated factors. Age, hemoglobin, CRP and LDH were retained in the final multivariate logistic regression model by the stepwise selection process (Table 1). In the development cohort, the proportions of patients who progressed to severe pneumonia in each category were as follows: age [< 50 years (0.7%), 50-59 years (6.0%), 60-69 years ( Fig. 1).
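The way a regression model is turned into an integer score can be sketched as follows. The thresholds (0 to 8 points low risk, 9 to 20 points high risk) are those stated in the abstract; everything else is an assumption made for illustration: the coefficients are hypothetical and are not the values in Table 1, each factor is treated as a single binary flag although the study used several categories for age and LDH, and dividing by the smallest coefficient before rounding is one common convention rather than the confirmed method of this study.

```python
def points_from_coefficients(coefficients):
    """Scale each log-odds coefficient by the smallest one and round
    to the nearest integer (an assumed convention)."""
    base = min(abs(b) for b in coefficients.values())
    return {name: round(b / base) for name, b in coefficients.items()}


def risk_group(total_points):
    """Apply the thresholds from the abstract: 0-8 low risk, 9-20 high risk."""
    if not 0 <= total_points <= 20:
        raise ValueError("total score must lie between 0 and 20")
    return "low" if total_points <= 8 else "high"


# Hypothetical coefficients for illustration only; NOT the values in Table 1.
hypothetical_coefficients = {
    "age": 2.4,
    "crp": 1.6,
    "ldh": 1.1,
    "hemoglobin": 0.5,
}
points = points_from_coefficients(hypothetical_coefficients)

# A hypothetical patient flagged on three of the four factors.
patient = {"age": True, "crp": True, "ldh": False, "hemoglobin": True}
total = sum(points[name] for name, flagged in patient.items() if flagged)
print(points, total, risk_group(total))
```

Because the points are integers derived once from the fitted model, the score can be added up at the bedside without a computer, which is the practical appeal of this kind of construction.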

Discussion
This study presented a novel scoring system that can be used to predict progression of Covid-19. Identifying, at an early stage, Covid-19 patients who are at higher risk of developing severe symptoms can inform better medical resource allocation and patient care during massive outbreaks. We found age, CRP, LDH and hemoglobin to be independent high-risk factors associated with progression of Covid-19 infection. Some of our findings are consistent with previous studies that identified various risk factors associated with poor clinical outcomes in patients with Covid-19 6,7,9,10. Older age, a high Sequential Organ Failure Assessment (SOFA) score, and d-dimer greater than 1 µg/mL at admission were described as potential risk factors for in-hospital mortality in patients with Covid-19 6. Wang et al. 9 reported older age, dyspnea, lymphopenia, comorbidities (e.g., cardiovascular disease), and acute respiratory distress syndrome (ARDS) as predictors of fatal outcomes in elderly Covid-19 patients. Age, sex, CRP, LDH, lymphocyte count, and features derived from CT images were the most frequently reported predictors of severe Covid-19 progression 10. Jiang et al. studied progression to ARDS using an artificial intelligence framework and reported higher hemoglobin as one of the risk factors for later development of ARDS. Moreover, higher hemoglobin levels were associated with male gender or even unreported tobacco use 11.
To our knowledge, the KDDH scoring system is one of the few scoring systems that predict Covid-19 progression at an early stage. Ji et al. 12 created the CALL score model, which uses age, comorbidities, lymphopenia and LDH to predict Covid-19 progression at an early stage. It is a relatively simple scoring system that requires only basic laboratory tests. However, it is limited by a smaller sample size and a lack of validation. The definition of progression to severe Covid-19 also differs. The CALL model counts deterioration of chest radiologic findings as progression, whereas the present study counts only patients requiring oxygen therapy 12. We did not include progression of radiologic findings without hypoxemia in the progression group, because deterioration of radiologic findings tends to lag behind the clinical course. Gong et al. 13 presented a prognostic nomogram based on seven factors (older age, higher LDH and CRP, direct bilirubin, red blood cell distribution width (RDW), BUN, and lower albumin) to identify patients who are likely to develop severe Covid-19 infection. The nomogram was developed in a multicenter population and was validated. However, its applicability is limited by the complexity of the score model and the larger number of clinical parameters. Liang et al. 14 developed a clinical risk score to predict the risk of developing critical illness in Covid-19 patients, an outcome distinct from progression to severe pneumonia. The KDDH scoring system has the advantages of high sensitivity and specificity together with good calibration, with no statistically significant difference between observed and predicted risk across quintiles of predicted risk. Moreover, it is uncomplicated and can predict progression to severe pneumonia from a simple blood test that can be performed in outpatient clinics.
The high negative predictive value of the presented scoring system (98.1% in the development cohort and 92.9% in the validation cohort) demonstrates its efficacy in identifying low-risk patients, who can be managed at other treatment facilities with minimal monitoring or through self-quarantine. High-risk patients, who are likely to require oxygen therapy, can be assigned to hospitals first and receive priority care at an early stage. With this scoring system, hospitals can prevent unnecessary overuse of medical care in limited-resource settings (e.g., during massive outbreaks) and may reduce mortality through effective allocation of medical resources. During the peak of the Covid-19 outbreak in South Korea, several patients died at home while waiting to be hospitalized after a Covid-19 diagnosis because of a shortage of beds. By triaging patients using the KDDH scoring system, such deaths could be reduced and prevented in future outbreaks.
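The relation between the reported sensitivity, specificity and negative predictive value can be checked with a short calculation. The 2×2 counts below are an assumption: they were chosen so that they add up to the cohort sizes (421 and 140) and reproduce the reported percentages, but the actual classification tables are not given in this text. The function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and negative predictive value from a
    2x2 table of high-risk/low-risk classification versus progression."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts, back-calculated to match the reported percentages;
# they are NOT taken from the paper's tables.
development = diagnostic_metrics(tp=31, fn=6, tn=303, fp=81)   # 421 patients
validation = diagnostic_metrics(tp=17, fn=7, tn=92, fp=24)     # 140 patients

for name, result in (("development", development), ("validation", validation)):
    print(name, {key: round(100 * value, 1) for key, value in result.items()})
```

Under these assumed counts, a low-risk classification would be wrong in only 6 of 309 development patients and 7 of 99 validation patients, which is the sense in which the score is suited to ruling out progression.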
This study has several limitations. First, some studies have shown that the severity of Covid-19 infection is related to smoking, obesity, and the time between symptom onset and hospital admission 15,16, but these variables were not collected in this study. Second, this was a retrospective, single-center cohort study conducted in Daegu, South Korea. However, this hospital treated the largest number of patients with Covid-19, so its patient data are well representative of the entire country. Moreover, it is one of the few studies conducted outside of China to develop and validate a risk scoring model for Covid-19 patients. A substantial number of patients were enrolled and followed up for more than 4 weeks without any change observed in final patient outcomes, and the scoring system was well validated.
In our experience, based on the KDDH scoring system, low-risk patients received care in the mild patient ward, where oxygen therapy was unavailable. High-risk patients were admitted to the main ward, where oxygen therapy and close monitoring could be provided. By concentrating medical personnel who can provide critical care in the main ward, we could treat critically ill patients effectively.
Table 1. Univariate and multiple logistic regression analysis in the development cohort. ALC, absolute lymphocyte count; ALT, alanine aminotransferase; ANC, absolute neutrophil count; AST, aspartate aminotransferase; BUN, blood urea nitrogen; CI, confidence interval; CPK, creatine phosphokinase; CRP, C-reactive protein; CXR, chest X-ray; eGFR, estimated glomerular filtration rate; LDH, lactate dehydrogenase; OR, odds ratio.
In summary, the presented KDDH scoring system was validated as a highly informative and successful risk stratification tool for identifying Covid-19 patients at higher risk of progression to severe pneumonia. Early adoption of this scoring system can support optimal use of limited medical resources in health facilities that are undergoing rapid Covid-19 outbreaks.

Data availability
The datasets generated and/or analyzed during the present study are available from the corresponding author on reasonable request.