Development of a multivariable prediction model for severe COVID-19 disease: a population-based study from Hong Kong

Recent studies have reported numerous predictors for adverse outcomes in COVID-19 disease. However, few simple clinical risk scores are available for prompt risk stratification. The objective of this study was to develop a simple risk score for predicting severe COVID-19 disease using territory-wide data based on simple clinical and laboratory variables. Consecutive patients admitted to Hong Kong’s public hospitals between 1 January and 22 August 2020 and diagnosed with COVID-19, as confirmed by RT-PCR, were included. The primary outcome was a composite of intensive care unit admission, need for intubation, or death, with follow-up until 8 September 2020. An external independent cohort from Wuhan was used for model validation. COVID-19 testing was performed in 237,493 patients, and 4442 patients (median age 44.8 years old, 95% confidence interval (CI): [28.9, 60.8]; 50% males) tested positive. Of these, 209 patients (4.8%) met the primary outcome. A risk score including the following components was derived from Cox regression: gender, age, diabetes mellitus, hypertension, atrial fibrillation, heart failure, ischemic heart disease, peripheral vascular disease, stroke, dementia, liver diseases, gastrointestinal bleeding, cancer, increases in neutrophil count, potassium, urea, creatinine, aspartate transaminase, alanine transaminase, bilirubin, D-dimer, high-sensitivity troponin-I, lactate dehydrogenase, activated partial thromboplastin time, prothrombin time, and C-reactive protein, as well as decreases in lymphocyte count, platelet count, hematocrit, albumin, sodium, low-density lipoprotein, high-density lipoprotein, cholesterol, glucose, and base excess. The model based on test results taken on the day of admission demonstrated an excellent predictive value. Incorporation of test results on successive time points did not further improve risk prediction.
The derived score system was evaluated with out-of-sample fivefold cross-validation (AUC: 0.86, 95% CI: 0.82–0.91) and external validation (N = 202, AUC: 0.89, 95% CI: 0.85–0.93). A simple clinical score accurately predicted severe COVID-19 disease, even without including symptoms, blood pressure or oxygen status on presentation, or chest radiograph results.


INTRODUCTION
Coronavirus disease 2019 (COVID-19) has a wide clinical spectrum, with disease severity ranging from completely asymptomatic to the need for intubation and death [1][2][3][4]. For example, those with existing cardiac problems are more likely to suffer from a more severe disease course [5][6][7][8][9][10][11], with potential modifier effects from different medication classes [12][13][14]. Aside from comorbidities, numerous risk factors such as high D-dimer [15], neutrophil count [16], liver damage [17], and deranged clotting [18] have been associated with disease severity. Such patients may benefit from early aggressive treatment [19][20][21][22][23]. However, to date, there are only a few easy-to-use risk models that can be used for early identification of such at-risk individuals in clinical practice [24,25]. The aim of this study is to extend these previous findings and develop a predictive risk score based on demographic, comorbidity, medication record, and laboratory data using territory-wide electronic health records, without clinical parameters or imaging results. We hypothesized that incorporation of test results on successive time points would improve risk prediction. The model was validated internally, and externally using a single-center cohort from Wuhan.

Basic characteristics
A total of 4442 patients (median age 44.8 years old, 95% CI: [28.9, 60.8]; 50% males) were diagnosed with COVID-19 infection between 1 January 2020 and 22 August 2020 in Hong Kong public hospitals or their associated ambulatory/outpatient facilities (Table 1). On follow-up until 8 September 2020, a total of 212 patients (4.77%) met the primary outcome of need for intensive care admission or intubation, or death. The survival curve is presented in Fig. 1. The sudden inflexion point at 200 days likely reflects the surge of new cases around this period. The baseline   Supplementary Table 3.

Development of a clinical risk score and validation
Univariate logistic regression analyses, shown in Table 2, identified the significant risk predictors for the composite outcome. However, in clinical practice, it is impractical to precisely input the values of all variables assessed from the different domains of the health records. Three different models were therefore developed (Tables 3-5), as detailed in the "Methods" section.
The easy-to-use score system is shown in Table 6. Patients meeting the primary outcome (n = 212) had a significantly higher risk score (median: 5.13, 95% CI: 3.13-7.43, max: 18.6) than those who did not (median: 1.41, 95% CI: 0.65-5.94, max: 18.2) (Table 7), indicating the significant risk stratification performance of the clinical risk score (OR: 17.1, 95% CI: 11-26.6) (Table 8). Survival curves stratified by the dichotomized risk score are shown in Fig. 2, where the yellow and blue curves represent the survival analysis for patients whose clinical risk score was larger and smaller than the cut-off, respectively.
Table footnotes: COPD, chronic obstructive pulmonary disease; ACEI, angiotensin-converting enzyme inhibitor; ARB, angiotensin receptor blocker; APTT, activated partial thromboplastin time. *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001. # indicates that the comparisons were made between patients meeting the primary outcome and those who did not.
For external validation, a total of 202 patients (48% males) from the Wuhan Heart Hospital were included. Comparisons of different performance measures for the clinical risk score for the Hong Kong cohort (fivefold cross-validation) and the Wuhan cohort are detailed in Table 9. Receiver operating characteristic (ROC) curves for predicting the adverse composite outcome of COVID-19 patients with the dichotomized risk score cut-off are shown in Fig. 3 (Hong Kong cohort: top panel; Wuhan cohort: bottom panel). As the Wuhan cohort did not routinely have AST tested, this variable was excluded for the performance comparisons. The AUC was 0.86 for the Hong Kong cohort (fivefold cross-validation) and 0.89 for the Wuhan cohort.
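As a rough illustration of how an odds ratio for a dichotomized risk score can be obtained, the sketch below computes an odds ratio and a Wald 95% CI from a 2x2 table. The function name and the cell counts are hypothetical and are not taken from the study; the odds ratio reported above may have been estimated differently (for example, directly from a regression model).

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: score above cut-off, outcome met
    b: score above cut-off, outcome not met
    c: score below cut-off, outcome met
    d: score below cut-off, outcome not met
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT the study data)
odds_ratio, lower, upper = odds_ratio_wald_ci(a=40, b=60, c=10, d=90)
print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

The Wald interval is symmetric on the log scale, which is why the bounds are not symmetric around the odds ratio itself.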

DISCUSSION
In this study, we developed a simple clinical score to predict severe COVID-19 disease based on age, gender, medical comorbidities, medication records, and laboratory examination results. We compared the prediction strengths of different criteria for the clinical risk score using out-of-sample validation for the Hong Kong cohort (fivefold cross-validation) and external validation for the Wuhan cohort, with AUCs of 0.86 (95% CI: 0.82-0.91) and 0.89 (95% CI: 0.85-0.93), respectively. The derived score system achieved good predictions even without the consideration of clinical parameters such as symptoms, blood pressure, oxygen status on presentation, or chest radiograph results.
COVID-19 disease has placed significant pressures on healthcare systems worldwide. Early risk stratification may better direct the use of limited resources and allow clinicians to triage patients and make clinical decisions objectively on the basis of limited evidence. For example, low-risk patients may require simple monitoring only, while patients who are likely to deteriorate may benefit from intensive drug treatment or intensive care. Currently, the availability of simple clinical risk scores for risk stratification is limited. The COVID-GRAM predicts development of critical illness, based on symptoms, radiograph results, clinical and laboratory details [24]. Similarly, the 4C Mortality Score included eight variables readily available at initial hospital assessment: age, sex, number of comorbidities, respiratory rate, peripheral oxygen saturation, level of consciousness, urea level, and C-reactive protein (score range 0-21 points) [25]. These scores produced moderately accurate predictions with C-index values of 0.86 and 0.61-0.76, respectively. A systematic review and meta-analysis has recently summarized different risk scores that have been developed by investigators from different countries [26]. As reported, the most frequently reported predictors were age, clinical status such as temperature, imaging results from chest radiography, and lymphocyte count. Recently, a study including 3927 patients from 33 hospitals developed the COVID-19 Mortality Risk (CMR) tool using the XGBoost algorithm [27]. This score is based on age, blood urea nitrogen, CRP, creatinine, glucose, AST, and platelet counts. Different teams in our country have already used a data-driven approach to develop predictive risk models for COVID-19 to predict viral transmission [28,29], adverse outcomes [30,31], and even to determine effects of risk perceptions on behaviors in response to the outbreak [32].
For example, our team recently developed a risk model based on non-linear interactions between different variables to predict intensive care unit admission using a tree-based machine learning model [30]. The above models are based on individual-level patient data. Where these are not available, investigators have successfully developed a useful model by using aggregate epidemiological reports of COVID-19 case fatality events [33].
In this study, with an expanded cohort, we developed a simple and easy-to-use model based on past comorbidity and laboratory data only, without needing clinical assessment details or chest imaging interpretation. The model based on test results taken on the day of admission already demonstrated an excellent predictive value with a C-statistic of 0.89. Incorporation of test results on successive time points did not further improve risk prediction, indicating that initial data are sufficient to produce accurate predictions of severe disease. Our model can aid clinical decision making, as early intervention may be associated with better outcomes [19][20][21][22][23].
The major limitation of this study is that it is based on a territory-wide cohort from a single city in China (Hong Kong). However, the risk score was independently validated using an external cohort from another city (Wuhan). We recognize that the baseline demographic and clinical characteristics of COVID-19 patients may differ in other countries. The model should be further externally validated using patient data from other geographical regions to allow further generalization.
In conclusion, a simple clinical score based only on demographics, comorbidities, medication records, and laboratory tests accurately predicted severe COVID-19 disease, even without including symptoms on presentation, blood pressure, oxygen status, or chest radiograph results. The model based on test results taken on the day of admission showed an excellent predictive value. Incorporation of test results on successive time points did not further improve risk prediction. Both out-of-sample fivefold cross-validation on the Hong Kong cohort and independent external validation on the Wuhan cohort demonstrated the significant risk stratification performance of the derived score system for severe COVID-19 disease. The presented score system uses commonly available clinical and laboratory results and does not require imaging results or advanced testing, and therefore can be particularly useful in facilities with constrained resources or remote hospitals with limited diagnostic capabilities such as computed tomography scans.

Study design and population
This study was approved by the Institutional Review Board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster. The need for informed consent was waived by the Ethics Committee owing to the retrospective and observational nature of this study. This was a retrospective, territory-wide cohort study of patients undergoing COVID-19 RT-PCR testing between 1 January 2020 and 22 August 2020 in Hong Kong, China. The patients were identified from the Clinical Data Analysis and Reporting System (CDARS), a territory-wide database that centralizes patient information from 43 local hospitals and their associated ambulatory and outpatient facilities to establish comprehensive medical data, including clinical characteristics, disease diagnosis, laboratory results, and drug treatment details. The system has been previously used by both our team and other teams in Hong Kong [34].

Outcomes and statistical analysis
The primary outcome was a composite of need for intensive care admission, intubation, or all-cause mortality, with follow-up until 8 September 2020. Mortality data were obtained from the Hong Kong Death Registry, a population-based official government registry with the registered death records of all Hong Kong citizens linked to CDARS. Patients who died 30 days or more after discharge were excluded. The need for ICU admission and intubation was extracted directly from CDARS. There was no adjudication of the outcomes, as these relied on ICD-9 coding or a record in the death registry. However, the coding was performed by clinicians or administrative staff who were not involved in the model development.

Development of different scoring systems
Three different models were developed. Model 1: optimum cut-off values of different variables at baseline were obtained from receiver operating characteristic (ROC) analysis. Laboratory examination results for each successive 24 h period were compared with the cut-off to determine whether the criterion was met at each time point.
Model 2: the criterion was met if the value was abnormal by standard laboratory criteria, without consideration of optimal cut-off values.
Model 3: laboratory test results were compared with the criteria without cut-off values to determine whether they were met on successive testing. For example, if a particular criterion was met on day 1, it was automatically considered fulfilled on subsequent days.
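The cut-off selection in Model 1 and the carry-forward rule in Model 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which ROC criterion defined the "optimum" cut-off, so Youden's J statistic is assumed here, and a higher value is assumed to indicate higher risk (for variables scored on decreases, the comparison would be reversed). The function names and the sample values are hypothetical.

```python
def youden_cutoff(values, outcomes):
    """Return (threshold, J) maximising sensitivity + specificity - 1.

    A measurement counts as a positive test when value >= threshold.
    Youden's J is an ASSUMED criterion, not one stated in the study.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, outcomes) if v >= threshold and y == 1)
        fp = sum(1 for v, y in zip(values, outcomes) if v >= threshold and y == 0)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


def carry_forward(met_by_day):
    """Model 3 style rule: once a criterion is met, it stays met on later days."""
    result, seen = [], False
    for met in met_by_day:
        seen = seen or bool(met)
        result.append(seen)
    return result


# Hypothetical laboratory values and outcomes (1 = composite outcome met)
values = [0.3, 0.4, 0.5, 0.9, 1.2, 2.0, 2.5, 3.1]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
cutoff, j = youden_cutoff(values, outcomes)
print(cutoff, round(j, 2))
print(carry_forward([False, True, False, False]))
```

The exhaustive threshold search is quadratic in the number of patients, which is acceptable for a sketch; a sorted single pass would scale better on a territory-wide cohort.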
A simple and easy-to-use score system was built based on beta coefficients using logistic regression analysis. The risk score of each COVID-19 patient was then calculated. The derived score system was evaluated on the in-sample fivefold cross-validation testing sets and on the out-of-sample dataset from Wuhan for external validation. The model was not recalibrated after validation.
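One way to read the scoring step is that each criterion contributes its beta coefficient when it is met, and the patient's risk score is the sum. The sketch below illustrates that reading only; the criterion names, the coefficient values, and the cut-off are hypothetical and do not reproduce Table 6.

```python
# Hypothetical beta coefficients for binary criteria (illustration only,
# NOT the values published in the study's score table)
HYPOTHETICAL_BETAS = {
    "age_above_cutoff": 1.2,
    "diabetes_mellitus": 0.6,
    "raised_c_reactive_protein": 0.9,
    "low_lymphocyte_count": 0.7,
}


def risk_score(criteria_met, betas=HYPOTHETICAL_BETAS):
    """Sum the coefficients of all criteria that a patient meets."""
    unknown = set(criteria_met) - set(betas)
    if unknown:
        raise KeyError(f"criteria without a coefficient: {sorted(unknown)}")
    return sum(betas[name] for name, met in criteria_met.items() if met)


def is_high_risk(score, cutoff):
    """Dichotomise the score: high risk if strictly above the cut-off."""
    return score > cutoff


patient = {
    "age_above_cutoff": True,
    "diabetes_mellitus": False,
    "raised_c_reactive_protein": True,
}
score = risk_score(patient)
print(round(score, 2), is_high_risk(score, cutoff=2.0))
```

Criteria absent from the patient record are simply treated as not met in this sketch; how missing laboratory values were handled in the study is not described in this section.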

External validation
For external validation, patients admitted to the Wuhan Asia General Hospital [38], Wuhan, China, between 10 February and 10 March 2020 were included. Diagnosis of COVID-19 was based on a positive PCR test and ground-glass shadows in the lungs on computed tomography scan, with follow-up 2 weeks post-discharge. Lipids and aspartate aminotransferase were not routinely collected and were therefore not included for validation.

Performance of the score
Performance of the score system was evaluated based on its ability to discriminate the composite outcome for each population. The results for the in-sample testing set and for the external out-of-sample validation cohort were reported, with the corresponding CIs. The area under the curve (AUC), accuracy, specificity, and precision were computed for all patient subpopulations. Receiver operating characteristic (ROC) curves were created for each of the cohorts with the derived score system to predict the adverse composite outcome. All statistical tests were two-tailed and considered significant if p value < 0.001. They were performed using RStudio software (Version: 1.1.456) and Python (Version: 3.6).
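The performance measures named above can be illustrated with a short, dependency-free sketch. The AUC is computed here through its rank interpretation (the probability that a randomly chosen patient with the outcome has a higher score than one without, ties counting half), which is equivalent to the area under the empirical ROC curve. The function names, scores, outcomes, and cut-off are hypothetical, and this is not the authors' analysis code.

```python
def auc(scores, outcomes):
    """AUC as P(score of a case > score of a non-case), ties counted as 0.5."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("both outcome classes are required")
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def classification_metrics(scores, outcomes, cutoff):
    """Accuracy, specificity, and precision for score > cutoff as a positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, outcomes):
        predicted_positive = s > cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Hypothetical risk scores and outcomes (1 = composite outcome met)
scores = [0.5, 1.0, 1.5, 2.0, 4.0, 5.0, 6.0, 3.0]
outcomes = [0, 0, 0, 0, 0, 1, 1, 1]
print(round(auc(scores, outcomes), 3))
print(classification_metrics(scores, outcomes, cutoff=2.5))
```

Confidence intervals, which the study reports alongside each measure, are not covered by this sketch; a bootstrap over patients would be one common way to obtain them.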

Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.