Heart age estimated using explainable advanced electrocardiography

Electrocardiographic (ECG) Heart Age conveying cardiovascular risk has been estimated by both Bayesian and artificial intelligence approaches. We hypothesised that explainable measures from the 10-s 12-lead ECG could successfully predict Bayesian 5-min ECG Heart Age. Advanced analysis was performed on ECGs from healthy subjects and patients with cardiovascular risk or proven heart disease. Regression models were used to predict patients’ Bayesian 5-min ECG Heart Ages from their standard, resting 10-s 12-lead ECGs. The difference between 5-min and 10-s ECG Heart Ages were analyzed, as were the differences between 10-s ECG Heart Age and the chronological age (the Heart Age Gap). In total, 2,771 subjects were included (n = 1682 healthy volunteers, n = 305 with cardiovascular risk factors, n = 784 with cardiovascular disease). Overall, 10-s Heart Age showed strong agreement with the 5-min Heart Age (R2 = 0.94, p < 0.001, mean ± SD bias 0.0 ± 5.1 years). The Heart Age Gap was 0.0 ± 5.7 years in healthy individuals, 7.4 ± 7.3 years in subjects with cardiovascular risk factors (p < 0.001), and 14.3 ± 9.2 years in patients with cardiovascular disease (p < 0.001). Heart Age can be accurately estimated from a 10-s 12-lead ECG in a transparent and explainable fashion based on known ECG measures, without deep neural network-type artificial intelligence techniques. The Heart Age Gap increases markedly with cardiovascular risk and disease.

www.nature.com/scientificreports/ measures of T-wave complexity based on singular value decomposition and signal averaging of the T wave. After being trained on a set of healthy individuals, this approach was clinically validated by describing an increased difference between Heart Age and chronological age both for subjects at risk of cardiovascular disease, and those with established cardiovascular disease 9 . However, importantly, reliable measurement of beat-to-beat RR and QT interval variability requires ECG recordings of at least several minutes. In addition, accurate quantification of high frequency QRS root mean squared voltages requires higher-fidelity recordings with more specialised equipment and sampling rates exceeding the 250-500 samples per second used today in most ECG machines for standard, 10-s ECG recordings. Therefore, the main aim of this study was to ascertain whether 5-min ECG Heart Age could be accurately predicted by only those measures available from standard 10-s 12-lead ECGs. Furthermore, a secondary aim was to also compare 10-s ECG Heart Ages to chronological ages in healthy subjects, subjects with cardiovascular risk factors, and patients with established cardiovascular disease. We hypothesised that standard-fidelity, 10-s 12-lead ECG recordings could accurately predict Bayesian ECG Heart Ages derived from higher-fidelity, 5-min 12-lead ECG recordings.

Methods
A pre-existing database of de-identified subjects with both 5-min and 10-s 12-lead ECG recordings was utilised for the study 9,18 . Within that database, healthy individuals, patients at cardiovascular risk, and patients with established cardiovascular disease were included. The healthy subjects, who had volunteered as asymptomatic controls, were recruited at Johnson Space Center (USA), the Universidad de los Andes (Venezuela), the University of Ljubljana hospitals and clinics (Slovenia), or Lund University Hospital (Sweden), as previously described in detail 18 . Patients with cardiovascular risk factors or established cardiovascular disease were recruited from cardiology clinics at either Texas Heart Institute (Houston, USA); the University of Texas Medical Branch (Galveston, USA), the University of Texas Health Sciences Center (San Antonio, USA), Brooke Army Medical Center (San Antonio, USA); St. Francis Hospital (Charleston, USA), the Universidad de los Andes (Mérida, Venezuela); and Lund University Hospital (Lund, Sweden).
All healthy subjects were low risk, asymptomatic volunteers with absence of any cardiovascular or systemic disease, based on clinical history and physical examination. Exclusion criteria for the healthy subjects included increased blood pressure at physical examination (≥ 140/90 mm Hg), treatment for hypertension or diabetes, or active smoking. Patients in the established cardiovascular disease group were included based on the presence of either coronary heart disease (determined by coronary angiography with at least one obstructed vessel (≥ 50%) in at least one major native coronary vessel or coronary graft, or, if coronary angiography was either unavailable or clinically not indicated, one or more reversible perfusion defects on 99 m-Tc-tetrofosmin single-photon emission computed tomography (SPECT) [20][21][22] ), left ventricular hypertrophy (LVH) based on imaging evidence of at least moderate, concentric wall thickening according to guidelines of the American Society of Echocardiography 23 , left ventricular systolic dysfunction (left ventricular ejection fraction ≤ 50%) at echocardiography, cardiac magnetic resonance imaging (CMR) or SPECT, or findings suggestive of dilated/hypertrophic/ischemic cardiomyopathy at echocardiography or CMR 18 . ECGs were acquired within 30 days of the cardiac imaging examination. Finally, subjects in the cardiovascular risk group were included based on the presence of cardiovascular risk factors such as hypertension or diabetes but no confirmed, established cardiovascular disease 9 .
Based on the above, three groups of study participants were identified: healthy subjects, subjects at cardiovascular risk, and patients with established cardiovascular disease. By methodological design, only healthy subjects were initially included when considering optimal measures of the 12-lead ECG available from 10-s recordings for predicting the 5-min ECG Heart Age. The 10-s ECG measures considered for the prediction model included: (1) From the conventional ECG: heart rate, R-to-R, P-wave, PR, QRS, QT, QTc, and TQ interval durations, as well as the conventional ECG amplitudes and axes; (2) From the transformation of the 12-lead ECG to the Frank X, Y and Z lead vectorcardiogram (VCG) via Kors' transform [24][25][26][27][28] : the spatial means and peaks QRS-T angles, the spatial ventricular gradient and its individual QRS and T components, the spatial QRS-and T-wave axes (azimuths and elevations), waveform amplitudes and areas, including those in the three individual vectorcardiographic planes, and spatial QRS-and T-wave velocities; and (3) Multiple measures of QRS-and T-wave waveform complexity based on singular value decomposition after signal averaging [29][30][31] .
For all study participants, the Heart Age Gap was calculated as the 10-s ECG Heart Age minus chronological age. All results were compared between the three groups, and for subgroups based on sex and age (above or below 60 years).
Each participant gave written informed consent. All recordings were obtained under Institutional Review Board (IRB) approvals from NASA's Johnson Space Center and partner hospitals that fall under IRB exemptions for previously collected and de-identified data. The study was performed in accordance with the Declaration of Helsinki.
Statistical analysis. Continuous variables were described using mean and standard deviation (SD). The chi-squared test was used to test for proportional differences between groups. Student's t test was used to compare group means.
The 10-s ECG Heart Age was initially derived only in the healthy subject group. First, standard least squares linear regression was used to identify the most promising univariable measures available from 10-s, standardfidelity ECG for predicting the 5-min ECG Heart Age. More than 20 such measures were identified, along with chronological age and sex, that had individual p < 0.0001. After the univariable linear regression identification procedure, stepwise, standard least squares multiple linear regression was then performed only on the selected measures to best predict, via multivariable model, the 5-min ECG Heart Age in the healthy subject group. The www.nature.com/scientificreports/ optimal multivariable model was defined as the most parsimonious one that maintained the highest possible R-squared value while at the same time maintaining all p < 0.0001 for the individual measures within the model in the healthy subject group, and at < 0.01 when the model was subsequently applied forward to the non-healthy (cardiovascular risk and disease) groups. Comparisons with 5-min ECG Heart Age are presented as scatter plots and Bland-Altman plots. A two-sided p-value of 0.05 was used as to define statistical significance. Statistical analyses were performed using SAS JMP version 11.0, SAS Institute Inc, Cary, NC, USA, and R version 3.5.3, R Foundation for Statistical Computing, Vienna, Austria, https:// www.R-proje ct. org/.

Results
In total, 2771 patients were included (n = 1682 healthy volunteers, n = 305 subjects with cardiovascular risk factors, and n = 784 with cardiovascular disease). Baseline characteristics are presented in Table 1, and average values for the ultimately included ECG measures are presented in Table 2. The ECG measures included in the final prediction models, and the intercept and coefficients for the regression equations are also presented in Table 3 for males and Table 4 for females. Agreement between the 10-s Heart Age and the 5-min Heart Age was excellent (R 2 = 0.94, p < 0.001, mean ± SD bias 0.0 ± 5.2 years), Fig. 1. Agreement was strong for both males and females (R 2 = 0.91, p < 0.001, and R 2 = 0.92, p < 0.001 respectively). In healthy subjects, on average, the Heart Age and chronological age were no different (0.0 ± 5.7 years), i.e. the Heart Age Gap was zero. In subjects with cardiovascular risk factors, the Heart Age Gap was larger (7.4 ± 7.3 years, p < 0.001). Patients with cardiovascular disease showed the largest Heart Age Gap (14.3 ± 9.2 years, p < 0.001 vs. subjects at cardiovascular risk), Fig. 2. This was evident both for females (Heart Age Gap: healthy subjects: 0.4 ± 5.8 years; cardiovascular risk: 7.8 ± 8.0 years; cardiovascular disease: 14.8 ± 9.0 years, p < 0.001) and for males (Heart Age Gap: healthy subjects: −0.3 ± 5.6 years; cardiovascular risk: 7.1 ± 6.6 years; cardiovascular disease: 14.0 ± 9.3 years, p < 0.001). A similar pattern was observed for younger (age < 60 years) individuals (n = 2142; Heart Age Gap: healthy subjects: 0.1 ± 5.7 years; cardiovascular risk: 7.8 ± 6.4 years; cardiovascular disease: 15.4 ± 9.8 years, p < 0.001) and for older individuals (n = 629; Heart Age Gap: healthy subjects: −1.7 ± 6.8 years; cardiovascular risk: 7.8 ± 6.4 years; cardiovascular disease: 15.4 ± 9.8 years). www.nature.com/scientificreports/

Discussion
We show that Heart Age can be estimated from standard-fidelity, conventional 10-s ECG recordings without requiring more specialised, higher-fidelity, 5-min long recordings. Consequently, ECG-based Heart Age as estimated using Bayesian techniques can now be more readily implemented in clinical practice. ECG-based estimation of Heart Age, as a marker of increased cardiovascular risk, has the advantage of being easily understood by the patient. It might also provide strong incentives for lifestyle changes or compliance to medication, similar to the reporting of "lung age" in the context of smoking cessation 16 . For example, in an asymptomatic patient with newly diagnosed hypertension, a value of 160/85 mm Hg may be difficult for the patient to understand in relation to future cardiovascular risk, and the communication of that risk may be difficult for the physician to translate. Instead, if this patient had a chronological age of 50 years but his Heart Age was 10 years higher, it may be more intuitive for the patient to understand that actions are needed. Additionally, an elevated blood pressure is merely a risk factor, and we do not know if the individual patient will suffer from the condition associated with the risk or not. By comparison, changes in the pattern of the ECG and advanced ECG often reflect cardiovascular end-organ changes. Some of these changes are potentially reversible, and this would move our understanding of risk one step beyond the reporting of traditional risk factors. This is line with the concept of "disease previvors", as described by Attia, et al., regarding the identification of pathological characteristics by subtle ECG findings in individuals with no manifest disease according to current diagnostic methods 32 .
In addition, we found Heart Age to be similar to chronological age in healthy individuals, while the Heart Age Gap was increasingly larger with increasing cardiovascular disease status. This suggests that Heart Age might also provide useful information for cardiovascular risk prediction, although validation in other datasets is necessary. Moreover, more tailored ECG approaches directly geared toward predicting outcomes, rather than indirectly geared toward predicting outcomes through a Heart Age per se, would also likely outperform strictly Heart Age-based attempts at risk stratification.  www.nature.com/scientificreports/ In the dataset used, healthy subjects were younger than subjects with cardiovascular risk factors or disease, and there was also a male preponderance in all three groups. The Heart Age Gap showed a similar pattern in both males and females, and in both younger (< 60 years) and older subjects.
Estimation of both the 5-min and 10-s Heart Age requires sinus rhythm, since P-wave duration is included in the related calculation. Thus, the assignment of a Heart Age to patients in atrial fibrillation is currently precluded. However, the risk associated with atrial fibrillation is well-established, and patients are often symptomatic. Consequently, the use of Heart Age to incrementally incentivise lifestyle interventions or improve compliance to medications may not be as necessary. Nonetheless, future studies to derive similar methods in patients with atrial fibrillation may also be of value.
Explainability and transparency of variables that contribute to ECG heart age. It is important to note that the patient's chronological age was purposely included in the estimation of Heart Age, just as in the original (5-min ECG) Bayesian model. The original model was not developed to predict a patient's actual chronological age, arguably a task of negligible clinical importance. Rather, the model was developed to estimate the "age" of the heart, and specifically how much that estimate differed from chronological age-quantified as the Heart Age Gap. A Bayesian statistical approach is predicated on knowing the patient's chronological age, with Heart Age being an adjustment of the chronological age based on electrocardiographic characteristics and sex. For an estimation of Heart Age to be accurate in predicting a Heart Age that is similar to the chronological age when the heart is healthy, and increased when the heart is diseased, it is desirable that the included ECG measures change with age, and that the change is augmented with increasing cardiovascular risk or disease severity.
The two ECG measures that had the strongest influence (highest t ratio) on the model were P-wave duration and spatial QT duration. This is not surprising, given that P-wave duration increases with age 33 , and increased P-wave duration can also be seen in advanced cardiovascular pathologies, e.g. heart failure and cardiac amyloidosis 34 . Similarly, QT duration increases with age 35 . Further, QT prolongation is associated with increased of cardiovascular risk, even beyond the rare long QT syndromes 36 , and with incrementally increased risk in advanced ages 37 . These general characteristics are also true for increased heart rate and for leftward shifting of the frontal plane QRS axis [38][39][40][41] .The other measures included in the score track changes in the vectorcardiographic QRS and T, and in T-wave complexity by singular value decomposition. Such changes are also known to occur in conditions associated with increased cardiovascular risk, such as hypertension and diabetes 42 , and in established cardiovascular disease, in which they often provide strong diagnostic and prognostic information [24][25][26][27][29][30][31] . Notably, these changes are not easily detectable by visual interpretation of a standard 12-lead ECG. How these ECG measures can affect the Heart Age is exemplified in Fig. 3. Taken together, the described ECG measures that contribute in a multivariable fashion to the Heart Age all have physiologically reasonable associations with age and disease in a way that is transparent to the assessing clinician, thus providing important explainability to the model.

Differences compared to Bayesian 5-min ECG heart age. The original, Bayesian 5-min ECG Heart
Age requires information from measures of beat-to-beat heart rate and QT variability 9 , and of the root-mean www.nature.com/scientificreports/ square voltage or other aspects of high-frequency (high fidelity) components of the QRS complex [43][44][45] . However, 10-s-duration recordings of standard fidelity do not allow for such measures, and therefore they were not included in the 10-s ECG Heart Age. However, unlike the original Bayesian 5-min ECG Heart Age, the 10-s ECG Heart Age should be derivable from any standard 12-lead ECG machine, as long as it is sufficiently equipped with software that can quantify the included measures and calculate the 10-s ECG Heart Age. The presented 10-s ECG Heart Age might therefore be anticipated to contribute to more widespread clinical penetration and use.
Comparison with other heart (or vascular) ages. The original 5-min ECG Heart Age was derived using supervised machine learning with Bayesian statistical methods, not related to deep neural networks (DNN). In turn, the 10-s ECG Heart Age was derived using supervised machine learning with multivariable linear regression analysis to predict 5-min ECG Heart Age. As discussed above, the ECG measures included in both of these methods are transparent and physiologically explainable. Other means of expressing a heart or vascular age have recently been published 15 , including the use of information in the 10-s 12-lead ECG to estimate ECG Heart Age based on DNN-type artificial intelligence (AI) 7,8,10 . For example, Attia, et al., showed that by using a DNN AI technique, a patient's chronological age could be predicted, and that if the difference between the predicted and actual age was small, prognosis was good 8 . When Heart Age according to Attia et al. was notably older than the chronological age, the risk of future death was increased 7 . Such results at least superficially correspond to the findings in our study that the Heart Age Gap increased with increasing burden of cardiovascular risk.
Other DNN-based AI methods, similar to that of Attia et al. also more recently reported similarly encouraging results 10,13,14 .  www.nature.com/scientificreports/ Although the results of such AI studies are promising, DNN-based AI techniques are nonetheless problematic in several respects, especially in relation to their lack of transparency and explainability inherent to the 'black box' of AI 46,47 .Without the ability to know the exact features of the 12-lead ECG that are most important in a given DNN model's output, both interpretability and ethical accountability are compromised 48 . Moreover, it is effectively impossible for a clinician to identify, when critically evaluating the diagnostic output of a DNN-based AI model, the contribution to the result from methodological artifact or bias merely related to noise or to differing technical specifications between different ECG machines 49 . Or the extent to which unanticipated results might merely relate to excess dependency on the particular characteristics of a given DNN AI model's training set 50 . In addition, a major flaw in all DNN-based AI ECG age models that we are aware of is that their age predictions were made using training datasets that also included individuals with both cardiovascular risk factors and established disease 8,10,13,14 . For ECG Heart Age to be used as a marker of potentially reversible cardiovascular disease and risk, it is imperative that ECG Heart Age agrees with chronological age in healthy populations, since it is the Figure 3. Example of the transparency and explainability of the Heart Age from two subjects with equal chronological age but different Heart Ages, illustrated by a commensurately aged female heart, but a disproportionately aged male heart. The ECG measures for each of the two patients are shown in the table in the middle of the figure, presented in the order of the relative strength (strongest first, based on t ratio [not shown]) of contribution to the Heart Age. Notably, P-wave duration is markedly different between these two patients, and heart rate is higher for the male than the female, helping drive the Heart Age higher in the male. The R-wave amplitude in lead Y is also much larger in the female, and the difference between the spatial ventricular gradient and the spatial mean QRS in the female is also larger, likely due to preserved T-wave amplitudes, contributing to her relatively younger Heart Age. Furthermore, possibly due to ischemic myocardial injuries, T-wave complexity is increased in the male, suggesting that increased myocardial repolarisation heterogeneity also contributes to driving Heart Age higher in the male, in spite of his shorter QT interval. www.nature.com/scientificreports/ deviation from the line of identity in this relationship that forms the basis for measurement-related accuracy and precision, and subsequent disease-related assessment and risk. Among the various proposed ECG Heart Ages in the literature 7,8,10,13,14 , the 5-min ECG Heart Age is the only ECG Heart Age that was trained using only healthy subjects 9 . This is a fundamentally important distinction between the different ECG Heart Age methods, and we argue that the ECG-based Heart Age Gap is correctly estimated only when the ECG Heart Age is trained using only healthy subjects. In future studies, a comparison of explainable ECG Heart Age with those from DNNbased AI models may be of value, in order to understand the differences both in accuracy and precision, and in diagnostic and prognostic performance. Hence, we believe that the pursuit of an ECG Heart Age developed from heart-healthy subjects of varying ages, but without a black-box DNN or related AI methodology is valuable, and that the present results provide sufficient confirmation of accuracy to encourage further development. Moreover, that the use of more transparent regression models will also increase the ability of clinicians to better understand the origin of any unexpected result, and to thereafter relay it to the patient with a more convincing sense of trust and accountability 48 . Finally, Heart Age can also now be retrospectively determined for clinical purposes or for retrospective scientific studies whenever raw digital data of acceptable quality are available from stored, standard 10-s ECG recordings.
Limitations. The dataset was not strictly divided into a training and test set. However, the fact that the selected ECG measures were derived only in the Healthy group, then applied forward to the other groups wherein the Heart Age Gap was incrementally larger with increasing cardiovascular risk and overt disease, does serve as a form of clinical validation. The prognostic value of the Heart Age Gap was not investigated and requires further evaluation in future studies. The use of three different levels of cardiovascular risk or disease status (healthy, cardiovascular risk without established disease, and established cardiovascular disease) might also be construed as somewhat arbitrary, although they do constitute a logical grouping with incrementally increasing risk from one group to the next. Future studies are nonetheless also required to validate any association between Heart Age and outcomes associated with different cardiovascular pathologies. Finally, although Heart Age was highly accurate, its precision could not be reliably defined in this study. These aspects therefore need to be addressed in future studies.