Big data analytics to improve cardiovascular care: promise and challenges

Key Points

  • The availability of big data analytical tools for use in cardiovascular practice and research will grow rapidly

  • Big data analytical applications, such as predictive models for patient risk and resource use, have great potential to improve cardiovascular quality of care and patient outcomes

  • Big data analytical tools in cardiovascular care are still at a nascent stage of development and evaluation, and evidence showing they improve quality of care and patient outcomes is lacking

  • Establishing the 'evidence base' for big data applications in relation to cardiovascular quality and outcomes of care is critical; big data analytical tools should be evaluated as health-care delivery interventions

  • Big data methods are tolerant of poor quality of underlying data; however, big data tools might be more valid and clinically useful in cardiovascular care when based on higher quality data

  • Substantial attention and resources will be required to integrate big data analytical applications optimally into cardiovascular practice, and to monitor their effect on care and outcomes


The potential for big data analytics to improve cardiovascular quality of care and patient outcomes is tremendous. However, the application of big data in health care is at a nascent stage, and the evidence to date demonstrating that big data analytics will improve care and outcomes is scant. This Review provides an overview of the data sources and methods that constitute big data analytics, and describes eight areas of application of big data analytics to improve cardiovascular care, including predictive modelling for risk and resource use, population management, drug and medical device safety surveillance, disease and treatment heterogeneity, precision medicine and clinical decision support, quality of care and performance measurement, and public health and research applications. We also delineate the important challenges for big data applications in cardiovascular care, including the need for evidence of effectiveness and safety, methodological issues such as data quality and validation, and the critical importance of clinical integration and proof of clinical utility. If big data analytics are shown to improve quality of care and patient outcomes, and can be successfully implemented in cardiovascular practice, big data will fulfil its potential as an important component of a learning health-care system.

Figure 1: Health-care system today.
Figure 2: Overview of big data analytics and applications.
Figure 3: Challenges for big data applications in cardiovascular care.


Author information




J.S.R. researched data for the article and made substantial contributions to the discussion of content. J.S.R. and T.M.M. wrote the manuscript, and J.S.R., K.E.J., and T.M.M. reviewed and edited the manuscript before submission.

Corresponding author

Correspondence to John S. Rumsfeld.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Rumsfeld, J., Joynt, K. & Maddox, T. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol 13, 350–359 (2016).
