Artificial intelligence-enhanced electrocardiography for early assessment of coronavirus disease 2019 severity

Despite challenges in severity scoring systems, artificial intelligence-enhanced electrocardiography (AI-ECG) could assist in early coronavirus disease 2019 (COVID-19) severity prediction. Between March 2020 and June 2022, we enrolled 1453 COVID-19 patients (mean age: 59.7 ± 20.1 years; 54.2% male) who underwent ECGs at our emergency department before severity classification. The AI-ECG algorithm was evaluated for severity assessment during admission, compared to the Early Warning Scores (EWSs) using the area under the curve (AUC) of the receiver operating characteristic curve, precision, recall, and F1 score. During the internal and external validation, the AI algorithm demonstrated reasonable outcomes in predicting COVID-19 severity with AUCs of 0.735 (95% CI: 0.662–0.807) and 0.734 (95% CI: 0.688–0.781). Combined with EWSs, it showed reliable performance with an AUC of 0.833 (95% CI: 0.830–0.835), precision of 0.764 (95% CI: 0.757–0.771), recall of 0.747 (95% CI: 0.741–0.753), and F1 score of 0.747 (95% CI: 0.741–0.753). In Cox proportional hazards models, the AI-ECG revealed a significantly higher hazard ratio (HR, 2.019; 95% CI: 1.156–3.525, p = 0.014) for mortality, even after adjusting for relevant parameters. Therefore, application of AI-ECG has the potential to assist in early COVID-19 severity prediction, leading to improved patient management.


COVID-19 severity classification.
After patients were transferred from the ED to the COVID-19 dedicated wards, we classified them based on the World Health Organization guideline into two categories: group 1 with mild-to-moderate illness, defined as requiring no oxygen therapy or only low-flow oxygen therapy (< 5 L via nasal prongs); and group 2 with severe-to-critical illness, characterized by the need for high-flow oxygen, continuous positive airway pressure, invasive mechanical ventilation, or extracorporeal membrane oxygenation (ECMO) [11][12][13].
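The two-group rule above can be written as a small decision function. The following is only an illustrative sketch: the `Support` enumeration, the function name, and the `flow_l_per_min` parameter are hypothetical, and the handling of nasal-prong oxygen at 5 L or more (assigned here to group 2) is an assumption, since the text defines only the < 5 L case explicitly.

```python
from enum import Enum


class Support(Enum):
    """Hypothetical labels for the highest level of respiratory support."""
    NONE = 0
    LOW_FLOW = 1    # oxygen via nasal prongs
    HIGH_FLOW = 2
    CPAP = 3
    IMV = 4         # invasive mechanical ventilation
    ECMO = 5


def severity_group(support, flow_l_per_min=0.0):
    """Return 1 (mild-to-moderate) or 2 (severe-to-critical)."""
    if support is Support.NONE:
        return 1
    # Assumption: nasal-prong oxygen at 5 L or more is treated as group 2.
    if support is Support.LOW_FLOW and flow_l_per_min < 5:
        return 1
    return 2


if __name__ == "__main__":
    print(severity_group(Support.LOW_FLOW, 3))   # group 1
    print(severity_group(Support.ECMO))          # group 2
```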
Data collection and covariates. All ECG data were acquired at a sampling rate of 500 Hz using a GE-Marquette ECG machine (General Electric Healthcare, Chicago, Illinois, United States). The raw data were stored as XML documents using the MUSE data management system in relational databases. All ECG data were manually adjudicated by two electrophysiologists. We included the demographic, laboratory, clinical, and ECG covariates in our prediction models. The demographic covariates included age, sex, ethnicity, and insurance type, and the vital signs included oxygen saturation, mean blood pressure, body temperature, and ventricular rate.

AI algorithm model for predicting ECG characteristics and classifying the data.
We extracted and analyzed the XML data from the MUSE data management system; to minimize artifacts, all data files were stored in the XML format on a GE ECG machine (General Electric Healthcare, Chicago, Illinois, United States). The ECGs were originally recorded from 12 leads; however, because of the device's data storage method, only data from eight leads were stored, excluding leads III, aVR, aVL, and aVF. The data for those four leads can be calculated from leads I and II using simple arithmetic operations, and it is common to apply these processes to approximate the data [14]. Therefore, only the eight recorded signals of leads I, II, V1, V2, V3, V4, V5, and V6 were used in this study. The signals from each lead were simultaneously measured for 10 s, and when the Base64-encoded value was read, eight one-dimensional arrays were obtained for each XML file. Because a 10-s signal contains multiple pulses and heart rate varies from person to person, we obtained approximately 10 or more pulses per person (Fig. 2). We specified the positions of the P, QRS, and T waves and analyzed those waves separately to avoid any bias from the variable heart rate. We used an algorithm to detect the R peaks, and the P, QRS, and T waves were located afterwards. We then analyzed each wave using AI, which calculated a score for each wave. The final result was the mean of these scores. Additionally, we utilized the Class Activation Map (CAM) to highlight the ECG segments that contributed significantly to the severity classification in COVID-19 patients [15].
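The decoding step and the "simple arithmetic operations" for the four unstored leads can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names are hypothetical, and the assumption that each Base64 string holds little-endian signed 16-bit samples is ours (the text states only that the values are Base64-encoded). The lead formulas are the standard Einthoven and Goldberger relations.

```python
import base64
import struct


def decode_lead(b64_text):
    """Decode one Base64 waveform string into a list of integer samples.

    Assumption: samples are little-endian signed 16-bit integers.
    """
    raw = base64.b64decode(b64_text)
    n = len(raw) // 2
    return list(struct.unpack("<%dh" % n, raw[: 2 * n]))


def derive_limb_leads(lead_i, lead_ii):
    """Compute leads III, aVR, aVL, and aVF from leads I and II."""
    pairs = list(zip(lead_i, lead_ii))
    return {
        "III": [ii - i for i, ii in pairs],          # III = II - I
        "aVR": [-(i + ii) / 2 for i, ii in pairs],   # aVR = -(I + II) / 2
        "aVL": [i - ii / 2 for i, ii in pairs],      # aVL = I - II / 2
        "aVF": [ii - i / 2 for i, ii in pairs],      # aVF = II - I / 2
    }


if __name__ == "__main__":
    # Build two tiny synthetic Base64 strings to stand in for XML content.
    b64_i = base64.b64encode(struct.pack("<2h", 100, 200)).decode("ascii")
    b64_ii = base64.b64encode(struct.pack("<2h", 300, 100)).decode("ascii")
    leads = derive_limb_leads(decode_lead(b64_i), decode_lead(b64_ii))
    print(leads)
```

At every sample the three augmented leads sum to zero, which gives a quick sanity check on the derived signals.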

Confirmation of the performance of AI-ECGs for predicting COVID-19 severity.
We trained and validated the AI-enhanced ECG algorithm to assess the severity of COVID-19 in patients who underwent their initial ECG at our ED before severity classification. We tested the accuracy of the AI-ECG using an external dataset. To confirm the accuracy of the developed AI-ECG, we compared the area under the receiver operating characteristic curve (AUROC) of the AI-ECG for detecting severe-to-critical illness in COVID-19 patients with those of the Early Warning Scores (EWSs), including the Modified Early Warning Score (MEWS), National Early Warning Score (NEWS), and the Worthing Physiological Scoring System (WPS). These scores were calculated after assessment of the relevant data [16][17][18].
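For readers less familiar with the AUROC, it can be computed directly as the probability that a randomly chosen severe-to-critical patient receives a higher score than a randomly chosen mild-to-moderate patient, with ties counted as one half (the Mann-Whitney formulation). The sketch below uses made-up labels and scores and a hypothetical function name; it is not the software used in the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for severe-to-critical (positive), 0 for mild-to-moderate.
    scores: model output or early warning score, higher meaning more severe.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative values only.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The same function can be applied to the AI-ECG score and to each EWS on the same patients, which is what makes the AUROCs directly comparable.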

Statistical analysis.
Continuous variables are reported as means ± standard deviations or medians and interquartile ranges, and categorical variables are presented as percentages and frequencies. Comparisons between groups were performed using the independent sample t-test or chi-square test. The performance of the AI model was measured using the AUROC, accuracy, recall (sensitivity), specificity, and F1 score. Recall is the ratio of correctly predicted positive observations to all observations that are actually positive, while the F1 score (balanced F-score) is the harmonic mean of the precision and recall. In addition, to predict mortality during the admission of COVID-19 patients, we performed a Cox proportional hazards regression analysis. For all variables, p < 0.05 was considered statistically significant. Statistical analyses were performed using SPSS statistical software for Windows (version 21.0; IBM, Armonk, New York, United States).
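The threshold-based metrics named above follow from the four cells of the confusion matrix. The sketch below is illustrative only, with a hypothetical function name and made-up labels; group 2 (severe-to-critical) is taken as the positive class, which is an assumption about the coding.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, specificity, F1, and accuracy for binary labels.

    1 denotes the positive class (assumed here to be severe-to-critical).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)                # harmonic mean
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    print(classification_metrics(truth, preds))
```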

Results
Patient characteristics. The baseline characteristics, comorbidities, and laboratory and electrocardiographic findings of the enrolled patients are shown in Table 1. The mean age of the 1,453 participants was 59.7 ± 20.1 years, and 54.2% of the patients were male. Group 1 (mild-to-moderate illness, requiring no oxygen therapy or only low-flow oxygen therapy) included 892 patients, while group 2 (severe-to-critical illness, requiring high-flow oxygen [≥ 5 L via nasal prong] or a higher level of treatment) included 561 patients. For both datasets A and B, the proportions of patients with hypertension (p < 0.001), diabetes mellitus (p < 0.001), and stroke (p < 0.001) were significantly greater in group 2 than in group 1. Regarding the laboratory findings, the white blood cell and platelet counts and C-reactive protein, N-terminal pro-B-type natriuretic peptide, creatine phosphokinase, creatine kinase-MB, blood urea nitrogen, and serum creatinine levels were also higher in group 2 than in group 1 in datasets A and B. On comparing the ECG findings between the two groups, we found that the patients in group 2 had a higher heart rate, prolonged QRS duration, and longer corrected QT (QTc) interval than those in group 1.
Clinical outcomes and the EWS according to the COVID-19 classification. The in-hospital mortality rate was 8.3% (121 patients), and all of these patients belonged to group 2. The proportions of heart failure, intensive care unit care, invasive mechanical ventilation, and ECMO were significantly higher in group 2 than in group 1 (p < 0.001). Overall, the duration of hospitalization was significantly longer in group 2 than in group 1 (p < 0.001; Table 2). In both datasets A and B, the MEWS, NEWS, and WPS scores were significantly higher in group 2 than in group 1 (p < 0.001; Table 3).

AI-ECG as a significant predictor of mortality risk in admission of COVID-19 patients.
Table 5 presents the analysis of risk factors associated with mortality in COVID-19 patients during hospitalization. In the Cox proportional hazards models for mortality during the admission of COVID-19 patients, after adjusting for age, sex, and relevant variables, including the EWS systems, the AI-ECG showed a significantly higher hazard ratio of 2.019 (95% CI: 1.156-3.525, p = 0.014; Table 5).
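The reported hazard ratio, confidence interval, and p-value are linked through the Cox regression coefficient: HR = exp(β), and the 95% CI is exp(β ± 1.96 × SE). As a rough consistency check, the sketch below recovers β, its standard error, and a two-sided Wald p-value from the published figures. The function name is hypothetical, and the check assumes a symmetric Wald interval on the log scale, which is the usual convention but is not stated in the text.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, SE, z, and the two-sided p-value from an HR and its 95% CI.

    Assumes the interval is exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


if __name__ == "__main__":
    beta, se, z, p = wald_from_hr(2.019, 1.156, 3.525)
    print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

With the values from Table 5, the recovered p-value is close to the reported p = 0.014, so the three reported quantities are mutually consistent.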

ECG wave analysis using class activation maps.
We performed a CAM analysis to visualize the ECG waveforms of COVID-19 patients across severity classifications and to better understand the impact of COVID-19 on the ECG. As illustrated in the Supplementary Figure, the activation map identified the P wave, the onset of the QRS complex, and the T wave as pivotal regions for patients with mild-to-moderate illness, while the QRS complex and the T wave were prominently highlighted for patients with severe-to-critical illness.
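In its original formulation, a class activation map is the weighted sum of the final convolutional feature maps, using the weights that connect each globally averaged feature map to the class of interest. The one-dimensional sketch below illustrates that computation for an ECG-like signal. It assumes a network ending in global average pooling followed by a linear layer; the architecture used in the study is not described here, and the function names and numbers are purely illustrative.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum over K feature maps, each a list of T activations.

    class_weights: K weights linking each pooled feature map to one class.
    """
    length = len(feature_maps[0])
    cam = [0.0] * length
    for weight, fmap in zip(class_weights, feature_maps):
        for t, value in enumerate(fmap):
            cam[t] += weight * value
    return cam


def normalize(cam):
    """Rescale the map to [0, 1] so it can be overlaid on the waveform."""
    lo, hi = min(cam), max(cam)
    if hi == lo:
        return [0.0] * len(cam)
    return [(v - lo) / (hi - lo) for v in cam]


if __name__ == "__main__":
    maps = [[0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 2.0, 0.0]]
    weights = [2.0, 0.5]
    print(normalize(class_activation_map(maps, weights)))
```

Time points with values near 1 are those the classifier weighted most heavily for the chosen class, which is how segments such as the QRS complex or T wave come to be highlighted.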

Discussion
We developed a new AI algorithm using initial 12-lead ECGs to identify disease severity and prognosis in patients hospitalized with COVID-19. The algorithm demonstrated reasonable accuracy for internal and external validations. To the best of our knowledge, this is the first study to develop a deep neural network that assesses the severity of COVID-19 based on initial ECGs at admission. Our algorithm can help identify patients who are more likely to develop severe-to-critical illness, thus enabling the effective deployment of medical resources and provision of adequate patient care in the early stages of a large-scale outbreak. Our AI algorithm showed the predictive value of an ECG in identifying COVID-19 severity using a deep learning algorithm. Compared to the previously commonly used physiological scoring systems, the AI-ECG had reliable performance in estimating the severity of COVID-19 in patients. The AI-ECG, combined with the EWS, had a more desirable performance [...] patients at high risk of progressing to severe disease within the limitations of medical resources. Rapid and accurate point-of-care testing using this AI method can improve patient prognosis by focusing on effective critical care treatment in a limited healthcare system. Furthermore, AI-ECG algorithms have the potential to be applied to recently available smartphones and wearable ECGs. Therefore, AI-ECG provides a fast, reliable, efficient, inexpensive, harmless, and easily accessible method for severity screening and predicting the prognosis of COVID-19.
Further, in response to the pandemic, most countries have established community treatment centers for COVID-19 patients or advocated for home isolation to manage medical resources efficiently, particularly regarding bed availability. The rapid clinical deterioration typically experienced by COVID-19 patients, often progressing within a few days from disease onset, underscores the importance of timely transfers from these facilities to hospitals equipped to manage severe-to-critical conditions [19][20][21]. The use of relatively simple, non-invasive, and cost-effective examinations, like an ECG, can be advantageous in these circumstances. This study was conducted with the anticipation that this approach would facilitate the efficient allocation of medical resources and consequently improve patient prognoses in upcoming pandemic scenarios similar to COVID-19.

Impact of COVID-19 on ECG.
In this study, patients with severe-to-critical illness had a higher heart rate and prolonged PR interval, QRS duration, and corrected QT interval compared with patients with mild-to-moderate illness. This may be explained by the effect of coronaviruses on both cardiac function and electrophysiology [22][23][24]. COVID-19 affects the QT interval independently of factors that may cause QT prolongation; additionally, it is associated with severe cardiac inflammation and renin-angiotensin system activation, known to affect repolarization [18,23,25,26]. Therefore, acute COVID-19 may affect the ECG results in subtle and multiple ways [27]. Furthermore, cardiac depolarization and repolarization are complex and delicate processes that can be affected by cardiac dysfunction, metabolic and electrolyte imbalances, and medications, which are factors that affect patients with COVID-19. Moreover, QT prolongation is also a marker of systemic illness severity and increased mortality, as well as an independent risk factor for sudden death both in the general population and those in the ICU [22].
Previous studies indicate that several ECG changes, such as prolonged PR interval, P wave duration, QT interval, and left ventricular hypertrophy, have been identified in ICU patients who died [28]. Heart failure and asymptomatic severe left ventricular dysfunction have both been successfully detected by deep neural networks based on the ECG [29]. Analyzing ECG waveforms of COVID-19 patients across severity classifications, our CAM analysis revealed distinct patterns. In patients with mild-to-moderate illness, the algorithm highlighted the importance of the P wave, the onset of the QRS complex, and the T wave. However, the QRS complex and the T wave emerged as critical areas for those with severe-to-critical disease. Although we cannot fully understand and interpret the decision-making approach in deep learning algorithms due to the "black box" limitation, our results from this analysis support the assumption that ECG changes in mild-to-moderate illness are related to atrial electrical abnormalities, early alterations in ventricular depolarization patterns, and ventricular repolarization abnormalities. Conversely, severe-to-critical disease exhibited more extensive ventricular depolarization and repolarization abnormalities. These observations suggest atrial and ventricular electrical remodeling and their potential impact on the decision-making process in deep learning algorithms [30]. Thus, such electrocardiographic changes may help with the risk stratification of severity and prognosis in patients with COVID-19.

AI-ECG and previous early warning scoring systems predict the severity in patients with COVID-19.
EWSs are widely used in clinical practice to help doctors estimate the risk of deterioration, monitor the patient's evolution, and make clinical decisions to enhance the critical patient's safety. Many EWS models have been developed, including the NEWS, MEWS, and WPS [31]. These models are based on the effects of COVID-19 on the cardiovascular and pulmonary systems and several extrapulmonary organs [32]. However, limitations in assessing the vital signs, consciousness, oxygen saturation, and other indirect indicators may be overcome by the AI-based approach based on the ECG.
In a recent study, the AUROCs for the NEWS and MEWS in predicting mortality were shown to be 0.809 (95% CI: 0.727-0.891) and 0.670 (95% CI: 0.573-0.767), respectively [31]. We demonstrated a reasonable accuracy of COVID-19 severity prediction in both internal and external validations. In our study, the developed AI using the initial ECG combined with the EWS for detecting severe-to-critical illness in COVID-19 presented a better performance compared with that of the physiologic scoring systems, MEWS, NEWS, and WPS (AUC of 0.833 [95% CI: 0.830-0.835]). In the early stage of COVID-19, ECG-based AI demonstrated better performance in predicting the progression to severe-to-critical illness than the physiologic scoring systems.
This study had some limitations. First, as this was a retrospective study conducted in a single tertiary hospital in Korea, it is necessary to validate the model with patients in other hospitals and countries. A prospective study is warranted to establish the model's usefulness as a new, feasible, and noninvasive screening tool. Second, although we used CAM to visualize ECG waveforms for COVID-19 patients across various severity classifications to better understand COVID-19's impact on the ECG, the interpretation of deep learning models and the underlying rationale of AI decision-making remain inherently challenging due to the nature of AI. Third, given the heterogeneity of the patient population, it is possible that the use of drugs that affect the ECG (e.g., antiarrhythmic drugs) may also have affected the network output. Fourth, it remains unclear whether the changes in the ECGs in the presence of a fever or acute respiratory distress associated with the presence of other infectious agents differed from those of COVID-19. Moreover, SARS-CoV-2 is constantly changing. Many notable strains have emerged, including the Alpha, Beta, Delta, and Omicron variants, and it remains unclear whether COVID-19-related ECG changes differ if a new mutation is more aggressive, more contagious, vaccine-resistant, able to cause more severe illness, or all of the above, compared with the original strain of the virus. Thus, newer variants may require prospective research into whether our AI algorithm will still predict severity accurately. Fifth, despite the favorable performance of our deep learning algorithm, overcoming false positives and negatives to identify the optimal treatment and predict the prognosis remains a critical issue. Although it is difficult to fully rely on the AI-ECG, the algorithm could predict disease severity using the initial 12-lead ECG, which is a rapid, simple, and inexpensive point-of-care test. Sixth, utilizing ECGs obtained from local health centers, private clinics, and primary and secondary hospitals might potentially be more closely aligned with the initial onset following a COVID-19 diagnosis. However, almost all patients were rapidly transferred to our hospital's ED without ECGs, resulting in a minimal time discrepancy from disease onset. Seventh, while our research robustly tested our model compared to established ones and used a separate dataset for validation, the single-center nature coupled with challenges from an imbalanced dataset and limited patients underscores the need for a large-scale study. Finally, recent studies have linked COVID-19 exposure to a higher risk of adverse cardiovascular outcomes, even after recovery from acute illness [33,34]. Consequently, further research with long-term follow-up in patients with COVID-19 complicated with cardiovascular involvement is required to better understand the long-term cardiovascular consequences of COVID-19 on the AI-ECG.
In conclusion, AI using the initial 12-lead ECG demonstrated reasonable performance for predicting COVID-19 severity in hospitalized patients. This AI algorithm could significantly improve COVID-19 severity screening, both efficiently and inexpensively, considering the limited availability of medical resources in a recurrent pandemic.

Figure 1. Study flow diagram showing the selection of patients with COVID-19 and the creation of the study datasets. ECGs were allocated to the training, internal validation, and external validation datasets using Data A and B. ECG, electrocardiography; COVID-19, coronavirus disease 2019.

Figure 2. Description of the artificial intelligence algorithm for predicting the severity in patients with COVID-19. COVID-19, coronavirus disease 2019.

Figure 3. Multiclass ROC curves with deep neural networks. (A) Internal validation for predicting the severity of COVID-19 patients using dataset A. (B) External validation for predicting the severity of COVID-19 patients using dataset A. COVID-19, coronavirus disease 2019; ROC, receiver operating characteristic.

Table 1. Patient characteristics and laboratory and electrocardiographic findings at enrollment. [...] fibrillation; QRSd, QRS duration; TIA, transient ischemic attack; SBP, systolic blood pressure; DBP, diastolic blood pressure; HR, heart rate; RR, respiratory rate; WBC, white blood cells; Hb, hemoglobin; PLT, platelets; CRP, C-reactive protein; NT-proBNP, N-terminal pro-brain natriuretic peptide; CK-MB, creatine kinase-MB; BUN, blood urea nitrogen.

Table 2. Clinical outcomes according to the COVID-19 classification. Values are expressed as n (%) or means ± standard deviations. ICU, intensive care unit; ECMO, extracorporeal membrane oxygenation. *p-value of Student's t-test or chi-square test between group 1 and group 2.

Table 3. A comparison among the Modified Early Warning Score, National Early Warning Score, and Worthing Physiological Scoring System according to disease severity in patients with COVID-19. Values are expressed as n (%) or means ± standard deviations. *p-value of Student's t-test or chi-square test between group 1 and group 2.

Table 4. AI model performance for predicting COVID-19 severity in hospitalized patients. Data in parentheses represent 95% confidence intervals. AI, artificial intelligence; avg., average; ECG, electrocardiography; EWS, Early Warning Scores; MEWS, Modified Early Warning Score; NEWS, National Early Warning Score; WPS, Worthing Physiological Scoring System. *F1 score (balanced F-score) is the harmonic mean of precision and recall and was calculated as follows: F1 score = 2 × (precision × recall) / (precision + recall).

Table 5. Cox regression analysis for mortality during admission of COVID-19 patients. AI, artificial intelligence; CI, confidence interval; ECG, electrocardiography; HR, hazard ratio; MEWS, Modified Early Warning Score; NEWS, National Early Warning Score; WPS, Worthing Physiological Scoring System.