Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network

Raghunath, Sushravya; Ulloa Cerna, Alvaro E.; Jing, Linyuan; vanMaanen, David P.; Stough, Joshua; Hartzel, Dustin N.; Leader, Joseph B.; Kirchner, H. Lester; Stumpe, Martin C.; Hafez, Ashraf; Nemani, Arun; Carbonati, Tanner; Johnson, Kipp W.; Young, Katelyn; Good, Christopher W.; Pfeifer, John M.; Patel, Aalpen A.; Delisle, Brian P.; Alsaid, Amro; Beer, Dominik; Haggerty, Christopher M.; Fornwalt, Brandon K.

doi:10.1038/s41591-020-0870-z

Letter
Published: 11 May 2020

Sushravya Raghunath¹ (ORCID: 0000-0001-7623-3095),
Alvaro E. Ulloa Cerna¹,
Linyuan Jing¹,
David P. vanMaanen¹,
Joshua Stough¹,²,
Dustin N. Hartzel³,
Joseph B. Leader³ (ORCID: 0000-0002-4101-5423),
H. Lester Kirchner⁴,
Martin C. Stumpe⁵,
Ashraf Hafez⁵,
Arun Nemani⁵,
Tanner Carbonati⁵ (ORCID: 0000-0001-6438-2800),
Kipp W. Johnson⁵,
Katelyn Young⁶,
Christopher W. Good⁷,
John M. Pfeifer⁸,
Aalpen A. Patel⁹,
Brian P. Delisle¹⁰,
Amro Alsaid⁷,
Dominik Beer⁷,
Christopher M. Haggerty¹,⁷ &
Brandon K. Fornwalt¹,⁷,⁹ (ORCID: 0000-0002-6231-9442)

Nature Medicine volume 26, pages 886–891 (2020)

Abstract

The electrocardiogram (ECG) is a widely used medical test, consisting of voltage versus time traces collected from surface recordings over the heart¹. Here we hypothesized that a deep neural network (DNN) can predict an important future clinical event, 1-year all-cause mortality, from ECG voltage–time traces. By using ECGs collected over a 34-year period in a large regional health system, we trained a DNN with 1,169,662 12-lead resting ECGs obtained from 253,397 patients, in which 99,371 events occurred. The model achieved an area under the curve (AUC) of 0.88 on a held-out test set of 168,914 patients, in which 14,207 events occurred. Even within the large subset of patients (n = 45,285) with ECGs interpreted as ‘normal’ by a physician, the performance of the model in predicting 1-year mortality remained high (AUC = 0.85). A blinded survey of cardiologists demonstrated that many of the discriminating features of these normal ECGs were not apparent to expert reviewers. Finally, a Cox proportional-hazard model revealed a hazard ratio of 9.5 (P < 0.005) for the two predicted groups (dead versus alive 1 year after ECG) over a 25-year follow-up period. These results show that deep learning can add substantial prognostic information to the interpretation of 12-lead resting ECGs, even in cases that are interpreted as normal by physicians.
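To make the headline metric concrete, the following is a minimal sketch of how an AUC can be computed from binary outcomes and model scores. It uses the pairwise (Mann–Whitney) definition: the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor. The function name `roc_auc` and the toy data are illustrative assumptions and are not taken from the study's code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 for an event (e.g. death within 1 year), 0 otherwise
    scores: model outputs, higher meaning higher predicted risk
    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

The pairwise loop is quadratic in the number of samples, so for data sets of the size reported here a rank-based implementation would be the practical choice; the definition is the same.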

**Fig. 1: Summary of model performance as area under the receiver operating characteristic curve for predicting 1-year mortality.**

**Fig. 2: Receiver operating characteristic curves indicating the optimal operating point and corresponding Kaplan–Meier survival curves.**
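As background for the survival curves named in this caption, here is a minimal sketch of the Kaplan–Meier product-limit estimator. It is a generic illustration under simple assumptions (right-censored data, one follow-up time per patient); the function name `kaplan_meier` and the toy values are hypothetical and do not reproduce the authors' analysis.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy follow-up data, invented for illustration only.
km_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for time, surv in km_curve:
    print(time, round(surv, 3))
```

Computing one such curve per predicted group (for example, predicted dead versus predicted alive at 1 year) gives the kind of paired curves the caption describes.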

Data availability

All requests for raw and analyzed data and related materials, excluding programming code, will be reviewed by our legal department to verify whether the request is subject to any intellectual property or confidentiality constraints. Requests for patient-related data not included in the paper will not be considered. Any data and materials that can be shared will be released via a material transfer agreement for non-commercial research purposes.

Code availability

Programming code related to data preprocessing and model specification will be made available under GNU General Public License version 3 upon request to the corresponding author.

References

Fye, W. B. A history of the origin, evolution, and impact of electrocardiography. Am. J. Cardiol. 73, 937–949 (1994).
Chesebro, J. H. et al. Thrombolysis in Myocardial Infarction (TIMI) trial, phase I: a comparison between intravenous tissue plasminogen activator and intravenous streptokinase. Clinical findings through hospital discharge. Circulation 76, 142–154 (1987).
Eagle, K. A. et al. A validated prediction model for all forms of acute coronary syndrome. JAMA 291, 2727–2733 (2004).
Kenchaiah, S. et al. Obesity and the risk of heart failure. N. Engl. J. Med. 347, 305–313 (2002).
Levy, W. C. et al. The Seattle Heart Failure Model: prediction of survival in heart failure. Circulation 113, 1424–1433 (2006).
Goldman, L. et al. Multifactorial index of cardiac risk in noncardiac surgical procedures. Surv. Anesthesiol. 22, 482 (1978).
Lloyd-Jones, D. M. et al. Use of risk assessment tools to guide decision-making in the primary prevention of atherosclerotic cardiovascular disease. Circulation 139, e1162–e1177 (2019).
Hwang, W., Chang, J., LaClair, M. & Paz, H. Effects of integrated delivery system on cost and quality. Am. J. Manag. Care 19, e175–e184 (2013).
Motwani, M. et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur. Heart J. 52, 468–476 (2016).
Curry, S. J. et al. Screening for cardiovascular disease risk with electrocardiography: US Preventive Services Task Force recommendation statement. JAMA 319, 2308–2314 (2018).
Lanza, G. A. The electrocardiogram as a prognostic tool for predicting major cardiac events. Prog. Cardiovasc. Dis. 50, 87–111 (2007).
Pan, J. & Tompkins, W. J. A real-time QRS detection algorithm. IEEE Trans. Biomed. Eng. 32, 230–236 (1985).
Belforte, G., De Mori, R. & Ferraris, F. A contribution to the automatic processing of electrocardiograms using syntactic methods. IEEE Trans. Biomed. Eng. 26, 125–136 (1979).
Madeiro, J. P. V., Cortez, P. C., Marques, J. A. L., Seisdedos, C. R. V. & Sobrinho, C. R. M. R. An innovative approach of QRS segmentation based on first-derivative, Hilbert and wavelet transforms. Med. Eng. Phys. 34, 1236–1246 (2012).
Köhler, B. U., Hennig, C. & Orglmeister, R. The principles of software QRS detection. IEEE Eng. Med. Biol. Mag. 21, 42–57 (2002).
Addison, P. S. Wavelet transforms and the ECG: a review. Physiol. Meas. 26, R155–R199 (2005).
Acharya, U. R. et al. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 89, 389–396 (2017).
LeBlanc, A. R. Quantitative analysis of cardiac arrhythmias. Crit. Rev. Biomed. Eng. 14, 1–43 (1986).
Luz, E. J., Schwartz, W. R., Cámara-Chávez, G. & Menotti, D. ECG-based heartbeat classification for arrhythmia detection: a survey. Comput. Methods Programs Biomed. 127, 144–164 (2016).
Rahhal, M. M. A. et al. Deep learning approach for active classification of electrocardiogram signals. Inf. Sci. 345, 340–354 (2016).
Liu, W. et al. Real-time multilead convolutional neural network for myocardial infarction detection. IEEE J. Biomed. Health Inform. 22, 1434–1444 (2017).
Goodfellow, S. D. et al. Towards understanding ECG rhythm classification using convolutional neural networks and attention mappings. in Machine Learning for Healthcare Conf. 83–101 (2018).
Yu, S. N. & Chen, Y. H. Electrocardiogram beat classification based on wavelet transformation and probabilistic neural network. Pattern Recog. Lett. 48, 1142–1150 (2007).
Asl, B. M., Setarehdan, S. K. & Mohebbi, M. Support vector machine-based arrhythmia classification using reduced features of heart rate variability signal. Artif. Intell. Med. 44, 51–64 (2008).
Karpagachelvi, S., Arthanari, M. & Sivakumar, M. Classification of ECG signals using extreme learning machine. Comput. Inf. Sci. https://doi.org/10.5539/cis.v4n1p42 (2014).
Kampouraki, A., Manis, G. & Nikou, C. Heartbeat time series classification with support vector machines. IEEE Trans. Inf. Technol. Biomed. 13, 512–518 (2009).
Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25, 65–69 (2019).
Smith, S. W. et al. A deep neural network learning algorithm outperforms a conventional algorithm for emergency department electrocardiogram interpretation. J. Electrocardiol. 52, 88–95 (2019).
Attia, Z. I. et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat. Med. 25, 70–74 (2019).
Attia, Z. I. et al. Prospective validation of a deep learning ECG algorithm for the detection of left ventricular systolic dysfunction. J. Cardiovasc. Electrophysiol. 30, 668–674 (2019).
Attia, Z. I. et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet 6736, 1–7 (2019).
Chen, T. & Guestrin, C. XGBoost: reliable large-scale tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (2016).
Chen, T. & He, T. Higgs Boson discovery with boosted trees. JMLR Work. Conf. Proc. 42, 69–80 (2015).
D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation 117, 743–753 (2008).
Quan, H. et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med. Care 43, 1130–1139 (2005).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV) 618–626 (IEEE, 2017).
Springenberg, J. T., Dosovitskiy, A., Brox, T. & Riedmiller, M. Striving for simplicity: the all convolutional net. Preprint at https://arxiv.org/abs/1412.6806 (2014).
Davie, A. P. et al. Value of the electrocardiogram in identifying heart failure due to left ventricular systolic dysfunction. Br. Med. J. 312, 222 (1996).
Hedberg, P. et al. Electrocardiogram and B-type natriuretic peptide as screening tools for left ventricular systolic dysfunction in a population-based sample of 75-year-old men and women. Am. Heart J. 148, 524–529 (2004).
van Buuren, S. & Groothuis-Oudshoorn, K. mice: multivariate imputation by chained equations in R. J. Stat. Softw. 45, https://doi.org/10.18637/jss.v045.i03 (2011).
Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. Proceedings of the ICML 27, 807–814 (2010).
Lin, M., Chen, Q. & Yan, S. Network in network. Preprint at https://arxiv.org/abs/1312.4400 (2013).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Hochreiter, S. & Schmidhuber, J. J. Long short-term memory. Neural Comput. 9, 1–32 (1997).
Prechelt, L. Early stopping—but when? in Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science 7700, 55–69 (Springer, 2012).
Shah, S. J. et al. Phenomapping for novel classification of heart failure with preserved ejection fraction. Circulation 131, 269–279 (2015).
Carreiras, C. et al. BioSPPy: biosignal processing in Python https://github.com/PIA-Group/BioSPPy (2018).
Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. Am. Stat. Assoc. 53, 457–481 (1958).
Cox, D. R. Regression models with life tables. J. R. Stat. Soc. B 34, 187–220 (1972).

Acknowledgements

The authors would like to acknowledge C. Nevius and B. McCarty for their help in submitting the IRB approval for the study and developing a scheduler for efficient computational scheduling of the study experiments. The authors also acknowledge the time and contribution of the following cardiologists who performed the survey reported in the paper: N. Mead, B. Carry, G. Yost, S. Siddiqi, T. Rizwan and B. Durr.

Author information

These authors contributed equally: Christopher M. Haggerty, Brandon K. Fornwalt.

Authors and Affiliations

Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
Sushravya Raghunath, Alvaro E. Ulloa Cerna, Linyuan Jing, David P. vanMaanen, Joshua Stough, Christopher M. Haggerty & Brandon K. Fornwalt
Department of Computer Science, Bucknell University, Lewisburg, PA, USA
Joshua Stough
Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
Dustin N. Hartzel & Joseph B. Leader
Department of Population Health Sciences, Geisinger, Danville, PA, USA
H. Lester Kirchner
Tempus Labs, Inc., Chicago, IL, USA
Martin C. Stumpe, Ashraf Hafez, Arun Nemani, Tanner Carbonati & Kipp W. Johnson
Department of Internal Medicine, Geisinger, Danville, PA, USA
Katelyn Young
Heart Institute, Geisinger, Danville, PA, USA
Christopher W. Good, Amro Alsaid, Dominik Beer, Christopher M. Haggerty & Brandon K. Fornwalt
Heart and Vascular Center, Evangelical Hospital, Lewisburg, PA, USA
John M. Pfeifer
Department of Radiology, Geisinger, Danville, PA, USA
Aalpen A. Patel & Brandon K. Fornwalt
Department of Physiology and Cardiovascular Research Center, University of Kentucky, Lexington, KY, USA
Brian P. Delisle

Contributions

S.R., C.M.H. and B.K.F. conceived the study and designed the experiments. S.R. conducted all the experiments. S.R., A.E.U.C. and J.S. contributed to the code base and deep learning framework used for the experiments. S.R., A.E.U.C., L.J., D.P.v.M., J.B.L. and D.N.H. assembled the data. H.L.K., B.P.D., A.A.P. and J.S. contributed to many discussions on experimental design. S.R. and D.P.v.M. designed the web application to perform the blinded surveys of cardiologists. C.W.G., J.M.P., A.A. and D.B. are the cardiologists who completed the survey and provided clinical insights. S.R., J.M.P., A.N., M.C.S., T.C., A.H. and K.W.J. contributed to the model interpretation with guided Grad-CAM. S.R., C.M.H. and K.Y. contributed to clinical chart review for cause-of-death analysis. All authors critically revised the manuscript.

Corresponding author

Correspondence to Brandon K. Fornwalt.

Ethics declarations

Competing interests

This work was supported in part by funding from the Pennsylvania Department of Health (SAP 4100070267), an American Heart Association Competitive Catalyst Award (17CCRG33700289), the Geisinger Health Plan and Clinic, and Tempus. The content of this article does not reflect the views of the funding sources. Geisinger receives funding from Tempus for ongoing development of predictive modeling technology and commercialization. Tempus and Geisinger have jointly applied for a patent related to the work. None of the Geisinger authors has ownership interest in any of the intellectual property resulting from the partnership. A.N., T.C., A.H., M.C.S. and K.W.J. are employees of Tempus.

Additional information

Peer review information: Michael Basson was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Summary of the data used in the study.

Summary of data used in the study. Note that ‘15 traces’ means the standard 12 ‘short duration’ leads (2.5 seconds of voltage data for each) plus 3 ‘long duration’ leads (10 seconds of voltage data for each). PDF = portable document format. CV = cross-validation.

Extended Data Fig. 2 Model Architecture.

Model architecture used in the study.

Extended Data Fig. 3 Model performance as area under precision recall curve.

Summary of model performance as area under the precision-recall curve (AUPRC) for predicting one-year mortality. (A) The mean AUPRC for the indicated input data, including (i) clinically-acquired ECG measures (9 numerical values and 31 diagnostic labels), (ii) ECG voltage-time traces only, (iii) age and sex alone, (iv) ECG measures with age and sex, and (v) ECG voltage-time traces with age and sex. Models for (i), (iii) & (iv) used XGBoost and models for (ii) & (v) used a DNN. ‘Normal’ refers to the ECGs in the test set labeled as normal by the original interpreting physician at the time of ECG acquisition, ‘abnormal’ refers to any ECGs not identified as normal in the test set and ‘all’ includes both normal and abnormal ECGs in the test set. (B) The relative performance of the DNN models using single leads as input (sorted by increasing performance). The mean AUPRC of the models M1-M5 (derived from 5-fold cross-validation, see text) is shown as the bar height, while individual data points for each of the 5 models are shown as a red ‘x’; black dots represent the AUPRC of model M0 (trained on 60% of the data and tested on the 40% holdout set). ‘2.5 seconds’ and ‘10 seconds’ refer to the duration of the voltage-time traces used for the model (see text for details).
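For readers unfamiliar with AUPRC, the sketch below shows one common estimator, step-wise average precision, together with the fold-averaging described for models M1-M5. Whether the authors used this exact estimator is not stated here, so treat it as an assumption; the function name `average_precision`, the toy data, and the per-fold values are all hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise average precision, a common estimator of AUPRC.

    Samples are ranked by descending score; precision is evaluated at the
    rank of each positive and averaged over all positives. Tied scores are
    kept in input order (no special tie handling in this sketch).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Toy data, invented for illustration only.
ap_labels = [1, 0, 1, 0]
ap_scores = [0.9, 0.8, 0.7, 0.1]
toy_ap = average_precision(ap_labels, ap_scores)

# Averaging across cross-validation folds (hypothetical per-fold values).
fold_aps = [0.40, 0.42, 0.38, 0.41, 0.39]
mean_ap = sum(fold_aps) / len(fold_aps)
print(toy_ap, mean_ap)
```

Unlike AUC, the chance level of AUPRC equals the event prevalence, which is why the two metrics can differ substantially on data with few events.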

Extended Data Fig. 4 Model explainability with Grad-CAM.

The guided gradient class activation maps (guided Grad-CAM) overlaid on signal-averaged waveforms for three patients (bottom 3 rows) as well as mean signal and activation across patients (top row) for leads V2 and V3. Clinical ECG findings for all three patients reported anterior acute myocardial infarction with apparent ST segment elevations. Note that these patients were predicted high risk by the model, and all died within a year after this ECG (that is, they were considered ‘true positives’). The overlay of the saliency map from guided Grad-CAM highlights the regions deemed salient (darker red regions) by the model towards prediction of high likelihood of mortality in a year, which coincided with the ST segment.

Extended Data Fig. 5 Cardiologist visual survey.

Accuracy of the ten cardiologists in identifying the true-positive ECG (patient dead within a year) when presented with two ‘normal’ ECGs forming a matched pair of one true positive and one true negative (n=100) (gray bars). Accuracy is also shown (black bars) for the same survey repeated after the cardiologists were shown an independent set of paired ECGs (n=100) with outcomes labeled. All ECG pairs presented were matched for age and sex.

Supplementary information

Supplementary Information

Supplementary Tables 1 and 2.

Reporting Summary

About this article

Cite this article

Raghunath, S., Ulloa Cerna, A.E., Jing, L. et al. Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network. Nat Med 26, 886–891 (2020). https://doi.org/10.1038/s41591-020-0870-z

Received: 29 March 2019
Accepted: 01 April 2020
Published: 11 May 2020
Issue Date: June 2020
DOI: https://doi.org/10.1038/s41591-020-0870-z
