The mission

Early detection of left ventricular dysfunction (a weak heart pump) prevents morbidity and mortality. Detection has traditionally required expensive, complex imaging studies such as echocardiograms or computerized tomography scans. Recently, artificial intelligence (AI) models have been applied to 12-lead electrocardiograms (ECGs) acquired in medical settings (AI-ECGs) to detect cardiac dysfunction1,2. We sought to massively scale up the use of the AI-ECG by applying it to signals recorded using the Apple Watch, a device that provides consumer-grade wearable ECG data, recorded by patients in non-clinical environments. This required two things: a prospective, decentralized study that used crowdsourcing techniques to collect ECGs recorded by patients with minimal resources in realistic settings, and transformation of the neural network designed for 12-lead ECGs so that it could effectively classify the noisier, highly filtered, single-lead watch ECGs acquired in variable body positions and environments.

The solution

Our team created an iPhone app that securely and remotely collected Apple Watch ECGs using Apple HealthKit. ECGs were recorded and transmitted by patients with or without heart disease using their own watches and phones. All patients consented to participate in the study remotely using a digital platform, and study recruitment was managed by a single study coordinator. Using the study app, we enrolled 2,454 patients at the Mayo Clinic from 46 US states and 11 countries, who transmitted 125,610 watch ECGs in 5 months, demonstrating the ability of this tool to acquire data inexpensively at large scale. Watch ECGs were uploaded to a dashboard where they could be reviewed by clinicians and were also made available for AI analysis. We created a single-lead version of our previously published and validated cardiac-dysfunction 12-lead model by filtering the 12-lead ECGs to mimic Apple Watch recordings and then retraining the network. We tested the watch AI-ECGs using tracings from a subgroup of patients who had a clinically indicated echocardiogram that measured ejection fraction (heart pump strength) for the ‘ground truth’.

Patients were highly engaged with the study (Fig. 1): most used the app every 2 weeks (when prompts were sent) and continued to contribute ECGs throughout the study period. The watch AI-ECGs effectively detected cardiac dysfunction, defined as an ejection fraction of 40% or less, with an area under the receiver operating characteristic curve of 0.88, which is clinically significant and exceeds the performance of available screening tests.
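To make the reported metric concrete, the following is a minimal sketch of how an area under the receiver operating characteristic curve can be computed against an ejection-fraction ground truth. It is an illustration only, not the study's evaluation code: the function names `label_from_ef` and `roc_auc` and all of the numbers are hypothetical. Only the 40% threshold comes from the text above.

```python
def label_from_ef(ef_percent, threshold=40.0):
    """Return 1 (dysfunction) if ejection fraction is at or below the threshold."""
    return 1 if ef_percent <= threshold else 0


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case receives
    a higher model score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example values, NOT study data.
ejection_fractions = [35, 60, 40, 55, 25, 65]      # percent, from imaging
model_scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]      # hypothetical AI-ECG outputs

labels = [label_from_ef(ef) for ef in ejection_fractions]
auc = roc_auc(labels, model_scores)
print(round(auc, 3))  # 0.889 (8 of 9 positive/negative pairs ranked correctly)
```

The pairwise formulation is quadratic in the number of cases; it is used here because it makes the meaning of the metric explicit, and a rank-based routine would be a more practical choice for a data set of this size.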

Fig. 1: Patient engagement in the study.

a, Total ECGs recorded throughout the study. The first approximately 2 months of the trial included only friends and family of the researchers, to assess system function (Soft launch; left downward arrow); this was followed by the official launch of the app (right downward arrow). b, Length of use of the app, in days. c, Unique daily uses per patient (multiple uses on the same day counted as single use). d, Duration of app use by participants, normalized to control for the varying amount of time that participants had access to the app, defined as follows: normalized time = (date of last upload – date of first upload) / (date of study end – date of first upload). © 2022, Attia, Z. I. et al.
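The normalization in panel d can be written out as a short sketch. The function name `normalized_use_time` and the example dates are hypothetical and chosen only to illustrate the formula in the caption; the handling of invalid date orderings is an assumption, as the source does not describe it.

```python
from datetime import date


def normalized_use_time(first_upload, last_upload, study_end):
    """Fraction of a participant's available study window during which
    they kept uploading ECGs:
    (last upload - first upload) / (study end - first upload)."""
    if not (first_upload <= last_upload <= study_end):
        raise ValueError("expected first_upload <= last_upload <= study_end")
    window_days = (study_end - first_upload).days
    if window_days == 0:
        # Assumption: a participant who joined on the final day has no window.
        raise ValueError("participant had no time window to normalize over")
    return (last_upload - first_upload).days / window_days


# Hypothetical participant, NOT study data.
value = normalized_use_time(
    first_upload=date(2021, 9, 1),
    last_upload=date(2021, 11, 30),
    study_end=date(2022, 2, 28),
)
print(value)  # 0.5 (90 days of use out of a 180-day window)
```

A value near 1 means the participant was still uploading close to the end of the study, regardless of how late they enrolled.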

The implications

In a world in which AI models have superhuman abilities3 and health sensors are ubiquitous, there is a real potential to identify cardiac disease early via home monitoring with methods that are more continuous and less obtrusive than current screening approaches. This may identify people who can benefit from established therapies, and those who do not require medical care, to better utilize overburdened healthcare resources. Coupled with this opportunity is an obligation to rigorously test AI in real-world scenarios to demonstrate effectiveness before its use at scale4. Using a digital, remote study, we showed that an AI-ECG model can be adapted to effectively classify real-world Apple Watch data. In addition to validating a specific model, we found that the use of a mobile-phone app is an affordable, fast and reliable method of collecting data, and that patients of all ages (ranging from 18 years to 90 years) remain highly engaged.

A potential limitation of our study is the cost of the Apple Watch. Although these devices are becoming more affordable and their use is growing, their medical application may exacerbate healthcare inequities. However, distributing watches to clinics in underserved environments for use as a shared resource may permit very cost-effective screening. A second limitation is the low racial diversity of the study population. Although the original 12-lead AI-ECG model has been found to be robust across races and ethnicities5, this robustness needs to be assessed for the watch ECG as well.

We intend to expand the collection of ECGs to mobile devices made by any vendor that allows access to data. (Apple Watches were used because HealthKit provides open access to raw ECGs; Apple was not aware of the study until its completion and provided no support.) We also intend to allow any patient to upload their data to the AI-ECG dashboard, to enable prospective screening for ventricular dysfunction and other diseases shown to be detectable by an AI-ECG in practice, and to facilitate access to care for patients in rural or resource-constrained areas.

Zachi I. Attia and Paul A. Friedman, Mayo Clinic, Rochester, MN, USA.

Expert opinion

“The biggest strength of this study is the proof of concept that AI algorithms developed to detect a variety of conditions on 12-lead ECGs may be modified and extrapolated to consumer wearable ECG recordings with relatively good accuracy. The study also highlights the process of enrolling patients in similar studies digitally and how institutional digital health products such as apps and dashboards can be very pragmatic for research and clinical care.” Mohamed Elshazly, Medical University of South Carolina, Charleston, SC, USA.

Behind the paper

This work grew out of the pressing need for tests to assess heart pump function. The importance of such tests has grown, resulting in considerable demand and delays in access to echocardiographic testing globally. Simultaneously, the COVID-19 pandemic underscored the importance of remote care, particularly in identifying diseases early and ahead of planned visits, to optimize the efficiency of in-person care. Thus, we sought to adapt our recently developed 12-lead AI-ECG for ventricular dysfunction to work with consumer watches already owned by patients, to shift diagnosis from the clinic to the home, and to make the tool widely accessible, massively scalable, inexpensive and actionable. P.A.F. and Z.I.A.

From the editor

“Although smartwatches have previously been tested for the detection of atrial fibrillation, a type of cardiac arrhythmia, this new, early-stage study demonstrates that smartwatches can be used to identify people with diminished heart function and can potentially serve as an early-warning system for heart failure. In this system, a smartphone app reports back to doctors at the Mayo Clinic, flagging people identified by the smartwatch as having impaired cardiac contractility.” Editorial Team, Nature Medicine.