Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

# Arrhythmic sudden death survival prediction using deep learning analysis of scarring in the heart

### Subjects

An Author Correction to this article was published on 28 April 2022

## Abstract

Sudden cardiac death from arrhythmia is a major cause of mortality worldwide. In this study, we developed a novel deep learning (DL) approach that blends neural networks and survival analysis to predict patient-specific survival curves from contrast-enhanced cardiac magnetic resonance images and clinical covariates for patients with ischemic heart disease. The DL-predicted survival curves offer accurate predictions at times up to 10 years and allow for estimation of uncertainty in predictions. The performance of this learning architecture was evaluated on multi-center internal validation data and tested on an independent test set, achieving concordance indexes of 0.83 and 0.74 and 10-year integrated Brier scores of 0.12 and 0.14. We demonstrate that our DL approach, with only raw cardiac images as input, outperforms standard survival models constructed using clinical covariates. This technology has the potential to transform clinical decision-making by offering accurate and generalizable predictions of patient-specific survival probabilities of arrhythmic death over time.

## Main

Sudden cardiac death (SCD) continues to be a leading cause of mortality worldwide, with an incidence of 50–100 per 100,000 in the general population in Europe and North America1, and accounts for 15–20% of all deaths2. Patients with coronary artery disease are at the highest risk of arrhythmic sudden cardiac death (SCDA)3,4. Although implantable cardioverter devices (ICDs) effectively prevent SCD due to ventricular arrhythmias, current clinical criteria for ICD candidacy—that is, left ventricular ejection fraction (LVEF) <30–35%5—capture only a mere 20% all SCDAs6, highlighting the critical need to develop personalized, accurate and cost-effective arrhythmia risk assessment tools to mitigate this enormous public health and economic burden. Several studies have identified risk factors for SCDA, and many risk stratification approaches have attempted to transcend LVEF7,8. However, limitations in these approaches have been barriers to their clinical implementation. Previous attempts have broadly stratified populations based on subgroup risk, failing to customize predictions to patients’ unique clinical features9. SCDA risk has been typically assessed at pre-defined finite time points, ignoring the likely patient-specific time evolution of the disease10. Additionally, in previous work, confidence estimates for predictions have been ‘one-size-fits-all’, varying only by risk subgroup, thus preventing the identification of low-confidence, potentially highly erroneous prediction outliers11. Moreover, few previous studies have validated their results externally or comprehensively compared model performance to standard approaches. A robust, generalizable SCDA risk stratifier with the ability to predict individualized, patient-specific risk trajectories and confidence estimates could considerably enhance clinical decision-making. Finally, although arrhythmia arises, mechanistically, from the heterogeneous scar distribution in the disease-remodeled heart, machine learning the features of that distribution has not been explored for risk analysis. Image-derived mechanistic computational models of cardiac electrical function that incorporate scar distribution have proven successful in predicting arrhythmia risk12; however, they remain exceedingly computationally intensive. Therefore, computational models are impractical as a first-stage screening tool in a broad population. Using raw contrast-enhanced (late gadolinium enhancement (LGE)) cardiac images that visualize scar distribution in a DL framework, which additionally draws on standard clinical covariates, could overcome these limitations and lead to accurate patient-specific SCDA probabilities in fractions of a second.

Here we present a DL technology for prediction of SCDA risk in patients with ischemic heart disease. Our approach, which we term Survival Study of Cardiac Arrhythmia Risk (SSCAR), embeds, within a survival model, neural networks to estimate individual patient times to SCDA (TSCDA). The neural networks learn from raw clinical imaging data, which visualize heart-disease-induced scar distribution, as well as from clinical covariates. The predicted patient-specific survival curves offer accurate SCDA probabilities at all times up to 10 years. The performance and high generalizability of the approach are demonstrated by testing on an external cohort, after internal cross-validation. Our technology represents a fundamental change in the approach to arrhythmia risk assessment, as SSCAR uses the data to directly estimate uncertainty in its predictions. Therefore, SSCAR has the potential to considerably shape clinical decision-making regarding arrhythmia risk, offering not a simple ‘at-risk/not-at-risk’ prediction but, instead, an estimate of the time to SCDA together with a sense of ‘how certain’ the model is about each predicted TSCDA.

## Results

### SSCAR overview

The arrhythmia risk assessment algorithm in SSCAR is a DL framework that incorporates multiple custom neural networks (which fuse different data types), combined with statistical survival analysis, to predict patient-specific probabilities of SCDA at future time points. Figure 1 presents an overview of SSCAR. On the left and right, cardiac magnetic resonance (CMR) images and clinical covariates (yellow panel) are used as inputs to the two corresponding branches of the model. The goal of each of the branches is to predict the patient-specific survival curve. In the left branch, cardiac CMR images—visualizing the patients’ three-dimensional (3D) ventricle geometry and contrast-enhanced remodeled tissue—are used as input by a custom-designed encoder–decoder convolutional neural sub-network (red panel, left). This CMR sub-network is trained to reduce the dimension of the input (that is, encode) and to discover and extract imaging features associated with SCDA risk directly from the CMR images by learning and applying filters (that is, convolving). The encoder–decoder design of the sub-network ensures that resulting imaging features retain sufficient information to be able to reconstruct the original images (red panel, left, decoder path). In the right branch, the 22 clinical covariates in Table 1 are provided to a dense sub-network (green panel, right), which discovers and extracts non-linear relationships between the input variables. The outputs of the sub-networks are combined (ensembled) in a way that best fits the observed SCDA event training data (center path, dot-dashed) to estimate the most probable TSCDA and the uncertainty in the prediction. The output of the model is a per-patient cause-specific survival curve (bottom, blue).

### SSCAR overall risk prediction performance

SSCAR was developed and internally validated using data from 156 patients with ischemic cardiomyopathy (ICM) enrolled in the Left Ventricle Structural Predictors of Sudden Cardiac Death (LVSPSCD) prospective observational study11,13. SSCAR performance was evaluated comprehensively on this internal set using Harrell’s concordance index (c-index)14—range is [0, 1], higher scores are better—and the integrated Brier score ($$\overline{\,{{\mbox{Bs}}}\,}$$)15—range is [0, 1], lower scores are better. SSCAR has excellent concordance on the internal set (.82–.89) for all times up to 10 years (Fig. 2a). Additionally, the $$\overline{\,{{\mbox{Bs}}}\,}$$ ranges from .04 to 0.12, suggesting strong calibration, given the high concordance. The model maintains its risk discrimination abilities at all times, as further evidenced by the high areas under the receiver operator characteristic (ROC) curves evaluated at years 2–9 (Extended Data Fig. 1). All events up to 10 years are used to construct the cross-validated ROC and precision-recall (PR) curves for the internal validation set (Fig. 2b,c). The area under the ROC curve is 0.87 (95% confidence interval (CI): 0.84–0.90), whereas the area under the PR curve is 0.93 (95% CI: 0.91–0.95).

To demonstrate the model’s performance, an external test was performed using an independent case–control set of 113 patients with coronary heart disease selected from participants with available CMR images and the same list of covariates enrolled in the PRE-DETERMINE study16. These patients had less severe left ventricular systolic dysfunction but otherwise had similar inclusion/exclusion criteria to patients in the LVSPSCD study (Methods). Despite the dissimilarities between cohorts, SSCAR performance carries over well to the external cohort, resulting in a c-index of 0.71–0.77 and $$\overline{\,{{\mbox{Bs}}}\,}$$ of 0.03–0.14 (Fig. 2a; dashed lines). The area under the ROC curve is 0.72 (95% CI: 0.67–0.77), and the area under the PR curve is 0.73 (95% CI: 0.68–0.78), on the external set (Fig. 2b,c).

### Patient-specific survival curves predicted by SSCAR

The SSCAR survival model presented here predicts cause-specific survival curves for each patient through two individualized parameters—the location μ and scale σ—characterizing the probability distribution of TSCDA (Methods). Using deep neural networks to directly learn these parameters from CMR images and from clinical covariates in a way that best models the survival data produces highly individualized survival probability predictions. Extended Data Fig. 2a illustrates individualized cause-specific survival curves (solid, blue) for a patient with TSCDA around 6 years (left panel) and a patient censored (non-SCDA event) at around 7 years (right panel). In both cases, the survival curves estimated by SSCAR accurately predict the event probabilities: in the first case, the estimated survival probability crosses the 50% threshold close to the event time; in the censored case, SSCAR predicts more than 80% probability of survival at the time of the (non-SCDA) event. For reference, two commonly used survival curves are depicted: the Kaplan–Meier estimate (purple, dot-dashed) and the Breslow estimate based on a Cox proportional hazards model using the clinical covariates (green, dashed), demonstrating worse performance by underestimating the risk for the patient with SCDA and overestimating the risk for the censored patient. Further details on SSCAR’s internal performance compared to the Cox proportional hazards model are presented in Fig. 3.

The predicted location parameter estimates the most probable TSCDA, and the predicted scale parameter provides a measure of confidence for the location. The inclusion of both a location and a scale parameter in the model offers the advantage of building in uncertainty directly into the TSCDA prediction. Notably, this uncertainty is patient-specific and learned from data. Extended Data Fig. 2b presents examples of predicted TSCDA probability distributions for two patients (P1 and P2) with different scale parameters, visualized as the widths of the distributions. Shown are the actual (dotted) and predicted (solid) TSCDA as well as the probability distributions (shaded). For P1, the prediction error is small (solid versus dashed vertical lines), and the model is certain, as seen by the narrower probability distribution of P1’s TSCDA or, equivalently, a smaller predicted scale parameter. In the case of P2, the prediction error is larger, and the model predicts a wider distribution or, equivalently, a larger scale parameter, indicating higher uncertainty. Remarkably, using the entire internal cohort to quantify this direct relationship between prediction error—calculated as the relative mean absolute difference of actual and predicted times—and scale parameter reveals significant positive correlation (Pearson’s r = 0.42, P < 0.001), demonstrating that SSCAR recognizes which predictions of TSCDA will turn out inaccurate and ‘lowers the confidence’ in them through a larger scale parameter.

### Image-based risk prediction

The CMR sub-network (see Extended Data Fig. 3 for architecture details) in SSCAR integrates neural network DL on images within an overall statistical survival model. This branch of SSCAR uses LGE-CMR—a modality uniquely suited for visualizing ventricle geometry and portions of the myocardium with contrast-enhanced remodelling—to learn image features most useful in predicting a patient’s survival TSCDA. CMR raw pixel values from the automatically segmented left ventricle are directly provided to the network, eliminating the need for arbitrary thresholds aiming to delineate areas of enhancement. Using only images as inputs (Fig. 3 and Supplementary Table 1), SSCAR achieves 0.70 (95% CI: 0.67–0.72) c-index and 0.17 (95% CI: 0.167–0.178) $$\overline{\,{{\mbox{Bs}}}\,}$$ for event data truncated at 10 years on the internal validation set. On the external testing set, the CMR-only model achieves 0.63 (95% CI: 0.59–0.66) c-index and 0.19 (95% CI: 0.186–0.200) $$\overline{\,{{\mbox{Bs}}}\,}$$. It is noteworthy that, among the covariate sub-network’s 22 clinical covariates, it already includes manually engineered features from the CMR images. For example, infarct size—calculated as the percentage of left ventricular tissue deemed fibrotic using manual segmentation performed by trained experts—was among the 22 and, indeed, had considerable effect on lowering TSCDA. Despite including CMR-based features in the covariate network, the CMR sub-network (using only CMR as inputs) achieves similar performance to the covariate one (Fig. 3). Furthermore, ensembling the two sub-networks together leads to a significant increase in overall performance compared to using just the covariate-based one, demonstrating that the CMR sub-network identifies different CMR-based features than the manually engineered ones.

Imaging features learned by the CMR network can be interpreted using a gradient-based sensitivity analysis (Fig. 4a). The gradient here quantifies the effect on the predicted TSCDA of features identified by the CMR neural network, which are averaged per patient to form the gradient map (Methods). This map overlaid on the myocardium (right column, blue and red heat map) shows the degree of contribution of the local pixel intensity to the most probable TSCDA (that is, to the location parameter) for a patient without an SCDA event (top) and one with SCDA (bottom). Myocardial regions found to be characterized with large positive gradient (dark blue) are interpreted as having high importance in increasing TSCDA, and, conversely, regions with large magnitude negative gradient (dark red) represent areas that are responsible for decreasing the predicted TSCDA. The areas of contrast-enhanced myocardium (middle column, in brighter green) do not fully overlap with the gradient map, which suggests that, although features learned by the CMR neural network may co-localize with enhanced tissue, the algorithm does not act as a mere enhancement locator. For example, the patient who did not experience SCDA has contrast-enhanced tissue, but the effect of these regions is to increase the predicted TSCDA, suggesting a nuanced relationship between presence of enhancement and propensity of SCDA.

### Non-linear neural network for covariate data

SSCAR incorporates patient clinical covariate data (Table 1) through the use of a dense, multi-layer neural network (Fig. 1, green panel). This sub-network discovers and extracts potential non-linear relationships between the covariates and integrates them within SSCAR’s overall survival predictions. We demonstrate the utility of the sub-network by comparing its performance with a (linear) Cox proportional hazards model (Fig. 3). To avoid mis-attributing performance differences to the underlying statistical models, we consider an intermediary model that uses neural network feature extraction with a Cox proportional hazards model. Using clinical covariate data only, SSCAR with a Cox survival model (covariate only, Cox) outperforms the standard Cox proportional hazards model (linear Cox PH) in terms of c-index (0.73 versus 0.58, dark blue, left y axis), balanced accuracy (0.65 versus 0.45, mid-blue, left y axis), F-score (0.78 versus 0.69, light blue, left y axis) and $$\overline{\,{{\mbox{Bs}}}\,}$$ (0.14 versus 0.30, red, right y axis). We show that the neural network model maintains interpretability by performing a sensitivity analysis of the predicted TSCDA with respect to changes in the covariates (Fig. 4b). As above, high positive gradients (blue) denote covariates for which small increases in their values lead to large increases in TSCDA, whereas small negative gradients (red) represent covariates for which small increases lead to large decreases in TSCDA. The top four positive gradient covariates are LVEF computed from CMR, β-blocker medication, heart rate computed from electrocardiogram (ECG) and use of digoxin. The bottom four negative gradient covariates are left ventricular mass at end-diastole, use of diuretic medication, QRS duration computed from ECG and infarct size (%).

## Discussion

Here we present an approach to SCDA risk assessment, termed the SSCAR framework, which uses a deep neural network survival model to predict patient-specific survival curves in ischemic heart disease. SSCAR consists of two neural networks: (1) a 3D convolutional network learning on raw unsegmented for enhancement LGE-CMR images that visualize heart-disease-induced scar distribution and (2) a fully connected network operating on clinical covariates. SSCAR’s predicted patient-specific survival curves offer accurate SCDA probabilities at all times up to 10 years. SSCAR is not only a highly flexible model, able to capture complex imaging and non-imaging feature inter-dependencies, but is also robust owing to the statistical framework governing the way these features are combined to fit the survival data. Our framework predicts entire probability distributions for the TSCDA, allowing for uncertainties in predictions to be themselves patient-specific and learned from data, thereby equipping the model with a self-correction mechanism. This approach remedies a well-known important limitation of neural networks—the high confidence in erroneous predictions. SSCAR’s integration of deep neural network learning within a survival analysis and the resulting detailed outputs could represent a paradigm shift in the approach to SCDA risk assessment.

Despite many heralding DL as the arrival of the artificial intelligence (AI) age in personalized healthcare17,18,19,20,21, no considerable progress has so far been made using DL on contrast-enhanced cardiac images to assess arrhythmia risk. Although there have been non-DL efforts to incorporate clinical imaging-derived features in SCDA risk stratification22,23,24, these severely underuse the data, suffering from two main limitations: features often rely on time-consuming, manual processing steps, typically involving arbitrarily chosen image intensity thresholds; or features are either too coarse to capture the intricacies of the scar distribution or highly mathematical, undermining their physiological underpinning. On the other hand, the DL efforts related to arrhythmia have focused primarily on its cardiologist-level detection in ECG signals25,26,27,28,29. In the current work, we present a DL approach that takes as input directly raw, unsegmented LGE-CMR images and automatically identifies features that best model and predict the TSCDA.

SSCAR is an SCDA risk prediction model that combines raw imaging with other data types in the same DL framework. Our technology operates on LGE-CMR images and clinical covariates within a unified feature learning process, allowing for the different data types to synergistically inform the overall survival model. Among the clinical covariates used in SSCAR are standard manually derived imaging features, which prevents the CMR neural network from merely re-discovering these known features and, instead, encourages it to learn new features. SSCAR achieves performance that is beyond the state-of-the-art in both relative terms—SCDA risk ordering among patients—and absolute terms—accurately calibrated probabilities of SCDA. Our robust testing scheme overcomes important limitations of previous work on SCDA risk prediction10,16,22,23,30. First, we demonstrate high generalizibility by computing internal cross-validation performance numbers resulting from 100 train/test splits of the data and, notably, on an entirely separate external cohort, showing modest performance degradation. Second, our approach prevents the model from being over-tuned to a certain time horizon by computing performance metrics at multiple time points up to 10 years.

Because SSCAR is a combination of neural networks, each working on different data types (images and clinical covariates), we were able to perform a comprehensive bottom-up analysis of overall performance. We demonstrated that the added complexity of our DL approach—potentially at some expense to interpretability—is justified by the significantly elevated performance numbers. Indeed, we developed and evaluated a regularized Cox proportional hazards model using the available clinical covariates to serve as a baseline for the rest of the analysis. We showed that the neural-network-driven feature extraction of SSCAR on the same covariates performs significantly better in the same proportional hazards setting, highlighting the importance of non-linear relationships in the covariates. Furthermore, we showed that, even when using only LGE-CMR images to predict arrhythmia risk, the CMR neural network in SSCAR (1) outperforms the Cox proportional hazards model constructed using clinical covariates, which include standard imaging and non-imaging features, and (2) performs on par with the covariate-only network in SSCAR using the same clinical variables, suggesting that the image-only neural network in SSCAR is able to identify highly predictive imaging features in the LGE-CMR images. Finally, we demonstrate that the imaging features found by SSCAR’s CMR network cannot be explained away even when considering non-linear relationships between standard covariates, as evidenced by the ensembled SSCAR model superior performance over SSCAR using either data type.

Notably, a level of interpretability is embedded in the overall design of the custom neural network used in SSCAR. Interpretability of AI algorithms is paramount to their broad adoption, and concerns surrounding it are particularly prevalent in healthcare. In our approach, we take multiple steps to ensure the relevance and interpretability of resulting features. Our sensitivity analysis of the outputs to the extracted features offers a lens into the neural network, rendering some transparency to the algorithm ‘black-box’ (Fig. 4). In addition, CMR images taken as input by the CMR neural network are automatically segmented to include myocardium-only raw intensity values, and the network is designed as an encoder–decoder to ensure minimal loss of information during the feature extraction process.

SSCAR achieves strong performance despite working on a relatively small dataset. A concern with DL on smaller datasets is overfitting, which manifests itself as high performance during training (good fit) but poor performance when applied to a new test set. Indeed, the results in this paper show some differences between metrics on the internal validation and external test cohorts. However, we emphasize that, although the two cohorts’ covariates were ‘harmonized’ where possible (Methods), they represent two different distributions (for example, low versus moderately reduced LVEF, unmatched versus matched case–control and three versus 60 CMR acquisition sites), likely accounting for any performance differences in the two populations. Furthermore, several measures were taken to mitigate overfitting. In addition to standard techniques—dropout, kernel and bias regularizers—we designed the CMR sub-network as an encoder–decoder that uses the distilled features used in risk prediction to also re-construct the original image as an additional regularization technique. Finally, all numbers cited on the internal validation set are averages of the test performance of hundreds of train/test data splits, adding a layer of statistical rigor.

In SSCAR, we directly model the cause-specific hazard rate and use the implied survival function to make predictions. A potential shortcoming of models that do not directly model competing risks is that predicted probabilities for the event of interest assume a reality where no other type of death could occur, thereby potentially undermining interpretability. A limitation here is that we could not compute the cause-specific cumulative incidence function, as it requires additional all-cause mortality data as well as competing risk data (for example, revascularization data). However, should such data become available, our competing risk framework makes such an extension straightforward.

An additional limitation in this work is that the list of covariates is not comprehensive. Few standard clinical covariates were dropped when ‘harmonizing’ the internal and external cohorts (for example, all diuretic types were merged into one variable, and there were no angiotensin receptor-neprilysin inhibitor data). However, because no left ventricle standard imaging covariates were excluded, we do not expect any of the omitted variables to affect conclusions drawn regarding the performance of the sub-components of SSCAR relative to the baseline Cox model. Including additional covariates identified in past work as predictors of SCDA, but not part of standard clinical practice, was beyond the scope of our work. However, these could, in principle, erode the performance of the image-based feature extraction in SSCAR in favor of the covariate-only part. Nevertheless, we would expect that, in general, including more variables with proper regularization can only improve the overall results in SSCAR, even if a re-balance of its components’ performance contribution occurs. Similarly, including right ventricle CMR images and parameters and adjusting the methodology accordingly could help generalize SSCAR to more cardiomyopathies.

SSCAR fuses cutting-edge DL technology with modern survival analysis techniques. It represents innovation in CMR imaging feature extraction and learning of non-linear relationships among standard clinical covariates. The technology aims to transform clinical decision-making regarding arrhythmia risk and patient prognosis by encouraging practitioners to eschew the view of predicted risk as a single number outputted by a ‘black-box’ algorithm but, rather, to be guided by the estimated time-to-outcome in the context of patient-specific time prediction uncertainty, which is itself built in SSCAR’s learning process. Through its accurate predictions and considerable levels of generalizability and interpretability, SSCAR represents an essential step toward bringing patient trajectory prognostication into the age of AI.

## Methods

The research protocol used in this study was reviewed and approved by the Johns Hopkins University institutional review board and the Brigham and Women’s Hospital institutional review board. All participants provided informed consent to be part of the clinical studies described below. There was no participant compensation.

### Patient population and datasets

This study was a retrospective analysis based on a subset (n = 269) of patients selected from the prospective clinical trials described below using the process outlined in Extended Data Fig. 4. Of note is that the entire model development described in this manuscript was based on the internal cohort (see below), whereas the case–control external cohort was used exclusively for testing (outcomes were solely used for computing relevant metrics once the model was fixed).

#### LVSPSCD cohort (internal)

Patient data came from the LVSPSCD study (ClinicalTrials.gov ID NCT01076660) sponsored by Johns Hopkins University. As previously described11,13, patients satisfying clinical criteria for ICD therapy for SCDA (LVEF ≤35%) were enrolled at three sites: Johns Hopkins Medical Institutions (Baltimore, Maryland), Christiana Care Health System (Newark, Delaware) and the University of Maryland (Baltimore, Maryland). A total of 382 patients were enrolled between November 2003 and April 2015. Patients were excluded if they had contraindications to CMR, New York Heart Association (NYHA) functional class IV, acute myocarditis, acute sarcoidosis, infiltrative disorders (for example, amyloidosis), congenital heart disease, hypertrophic cardiomyopathy or renal insufficiency (creatinine clearance <30 ml min−1 after July 2006 or <60 ml min−1 after February 2007). The protocol was approved by the institutional review boards at each site, and all participants provided informed consent. CMR imaging was performed within a median time of 3 days before ICD implantation. The current study focused on the ischemic cardiomyopathy patient subset with adequate LGE-CMR, totaling 156 patients. As part of the clinical study, the participants had undergone single-chamber or dual-chamber ICD or cardiac resynchronization with an ICD implantation based on current guidelines. The programming of anti-tachycardia therapies was left to the discretion of the operators.

#### PRE-DETERMINE and DETERMINE Registry cohorts (external)

The PRE-DETERMINE (ClinicalTrials.gov ID NCT01114269) and accompanying DETERMINE (ClinicalTrials.gov ID NCT00487279) Registry study populations are multi-center, prospective cohort studies comprised of patients with coronary disease on angiography or documented history of myocardial infarction (MI). The PRE-DETERMINE study enrolled 5,764 patients with documented MI and/or mild to moderate left ventricle dysfunction (LVEF between 35% and 50%) who did not fulfill consensus guideline criteria for ICD implantation on the basis of LVEF and NYHA class (that is, LVEF >35% or LVEF between 30% and 35% with NYHA Class I heart failure) at study entry6. Exclusion criteria included a history of cardiac arrest not associated with acute MI, current or planned ICD or life expectancy of less than 6 months. The accompanying DETERMINE Registry included 192 participants screened for enrollment in PRE-DETERMINE who did not fulfill entry criteria on the basis of having an LVEF <30% (n = 99), an LVEF between 30% and 35% with NYHA Class II–IV heart failure (n = 19) or an ICD (n = 31) or who were unwilling to participate in the biomarker component of PRE-DETERMINE (n = 43). Within these cohorts, 809 participants had LGE-CMR imaging performed. Within this subset of patients, 23 cases of SCD occurred and were matched to four controls on age, sex, race, LVEF and follow-up time using risk set sampling. Of the resulting 115 patients, the current study focused on 113 patients with adequate LGE-CMR images for analysis. Finally, covariate data for this cohort were minimally ‘harmonized’ with the internal cohort, by retaining common covariates only. Some important differences between the external and internal cohorts remained, such as significantly higher LVEF in the external cohort.

#### LGE-CMR acquisition

The CMR images in the internal and external cohort were acquired using 1.5-T magnetic resonance imaging devices (Signa, GE Medical Systems; Avanto, Siemens). The exact software versions for the devices cannot be precisely retroactively ascertained given the very broad nature of the study. All were two-dimensional (2D) parallel short-axis left ventricle stacks. The contrast agent used was 0.15−0.20 mmol kg−1 of gadodiamide (Omniscan, GE Healthcare) or gadopentetate dimeglumine (Magnevist, Schering), and the scan was captured 10−30 minutes after injection. Owing to the multi-center nature of the clinical studies considered here, there were variations in CMR acquisition protocols. The most commonly used sequence was inversion recovery fast gradient echo pulse, with an inversion recovery time typically starting at 250 ms and adjusted iteratively to achieve maximum nulling of normal myocardium. Typical spatial resolutions ranged 1.5−2.4 × 1.5−2.4 × 6−8 mm, with 2−4-mm gaps. CMR images in the external cohort were sourced from 60 sites with a variety of imaging protocols, whereas those in the internal cohort originated from three sites and were more homogeneous. No artifact corrections were applied to the images. More details regarding on CMR acquisition can be found in previous work11,13,31,32.

#### Clinical data and primary endpoint

In both LVSPSCD and PRE-DETERMINE/DETERMINE cohorts, baseline data on demographics, clinical characteristics, medical history, medications, lifestyle habits and cardiac test results were collected (see Table 1 for a list of the common ones between the cohorts that were used in SSCAR). The primary endpoint for LVSPSCD was SCDA defined as therapy from the ICD for rapid ventricular fibrillation or tachycardia or a ventricular arrhythmia not corrected by the ICD. For the PRE-DETERMINE studies, the primary endpoint was sudden and/or arrhythmic death. Deaths were classified according to both timing (sudden versus non-sudden) and mechanism (arrhythmic versus non-arrhythmic). Unexpected deaths due to cardiac or unknown causes that occurred within 1 hour of symptom onset or within 24 hours of being last witnessed to be symptom free were considered sudden cardiac deaths. Deaths preceded by an abrupt spontaneous collapse of circulation without antecedent circulatory or neurological impairment were considered arrhythmic in accordance with the criteria outlined by Hinkle and Thaler16. Deaths that were classified as non-arrhythmic were excluded from the endpoint regardless of timing. Out-of-hospital cardiac arrests due to ventricular fibrillation that were successfully resuscitated with external electrical defibrillation were considered aborted arrhythmic deaths and were included in the primary endpoint.

#### Data preparation

The inputs to our model were the unprocessed LGE-CMR scans and the clinical covariates listed in Table 1. The training targets were the event time and event type (SCDA or non-SCDA). As a pre-processing step, the raw LGE-CMR scans were first segmented for left ventricle myocardium using a method based on convolutional neural networks developed and described in previous work33. In brief, this segmentation network consisted of three sub-networks: a U-net with residual connections trained to identify the entire region of interest; a U-net with residual connections trained to delineate the myocardium wall; and an encoder–decoder tasked with correcting anatomical inaccuracies that may have resulted in the segmentation. In this context, anatomical correctness was defined via a list of pass/fail rules (for example, no holes in the myocardium, circularity threshold and no disconnected components). Once each patient’s LGE–CMR 2D slices were segmented via this method, they were stacked, all voxels outside the left ventricle myocardium were zeroed out and the slices were sorted apex-to-base using DICOM header information and step-interpolated on a regular 64 × 64 × 12 grid with voxel dimensions 2.5 × 2.5 × 10 mm. These dimensions were chosen to make all patient volumes consistent with minimal interpolation from the original resolution while allowing enough room to avoid truncating the left ventricle. Finally, the input to the neural network model consisted of a two-channel volume (that is, 64 × 64 × 12 × 2). The first channel was a one-hot encoding of the myocardium and blood pool masks. The second channel had zeros outside of the myocardium and the original CMR intensities on the myocardium, linearly scaled by multiplication with half the inverse of the median blood pool intensity in each slice. To mitigate overfitting, train/time data augmentation was performed on the images, specifically 3D in-plane rotations in increments of 90° to avoid artifacts and panning of the ventricle within the 3D grid. The clinical covariate data were de-meaned and scaled by the standard deviation.

### Survival model

#### Statistical fit

For each patient i, the outcome data were the pair (Xi, Δi), where Xi is the minimum between the time to SCDA from arrhythmia Ti and the (right) censoring time Ci, after which either follow-up was lost or the patient died due to a competing risk. The outcome Δi is 1 if the patient had the arrhythmic event before they were censored (TiCi) and 0 otherwise. We estimated the (pseudo-)survival probability function Si(t), the probability that the time to SCDA exceeds t. We modeled the Ti values as independent, each having a cause-specific hazard rate34 based on the log-logistic distribution with location parameter μi and scale parameter σi, such that $${S}_{i}(t;{\mu }_{i},{\sigma }_{i})=1/\{1+\exp [(\log t-{\mu }_{i})/{\sigma }_{i}]\}$$. The patient-specific parameters μi and σi were modeled as outputs of neural networks applied to LGE-CMR images and clinical covariates, trained by minimizing the loss function given by the negative likelihood:

$$-\log {{{\mathcal{L}}}}=-\mathop{\sum}\limits_{i}\frac{\log {x}_{i}-{\mu }_{i}}{{\sigma }_{i}}+{\delta }_{i}\log {\sigma }_{i}+(1+{\delta }_{i})\log \left[1+\exp \left(\frac{\log {x}_{i}-{\mu }_{i}}{{\sigma }_{i}}\right)\right],$$

where xi is the observed time and δi is the censoring status. With μi and σi estimated, the patient-specific survival functions were given by Si(t) as above.

#### Performance metrics

The all-time performance of the models was evaluated using two measures. The first was Harrell’s c-index14 with the patient-specific μi parameters as the risk scores ($$\exp ({\mu }_{i})$$ is the mode of the log-logistic distribution) to gauge the model’s risk discrimination ability. The second was the integrated Brier score15, which is defined as the time-average of mean squared error (MSE) between true 0/1 outcome and predicted outcome probability and gauges both probability calibration and discrimination. Both measures were adjusted for censoring, corrected by weighing with the inverse probability of censoring, and calculated for data before a given cutoff time τ35; if unspecified, τ = 10 years, corresponding with the maximum event time in the dataset. Metrics derived from the confusion matrix (for example, precision and recall) were computed at several time points (τ = 2, 3… years). Probability thresholds at these times were selected by maximizing F-score (for precision, recall and F-score) or Youden’s J statistic (for sensitivity, specificity and balanced accuracy) on the training data. Of note, to preserve consistency in evaluation between the internal and external cohorts, metrics computed on the external cohort were not covariate adjusted, potentially underestimating performance36.

#### Neural network architecture

SSCAR is a supervised survival analysis regression model composed of two sub-networks, each operating on different input types (Fig. 1): a convolutional sub-network (‘CMR’), which takes the LGE-CMR images as inputs, and a dense sub-network (‘covariate’). which uses the clinical covariate data. Feature extraction in the CMR sub-network from the LGE–CMR images was achieved by a 3D convolutional encoder–decoder model. The encoder used a sequence of 3D convolutions and pooling layers, followed by one dense layer to encode the original 3D volume into a lower-dimensional vector. Non-linear activation functions and dropout layers were added before each downsampling step. The encoding was further used for two purposes: survival and reconstruction. For the survival branch, the encoding was first stratified into one of r (learned) risk categories (Supplementary Table 2) and then fed to a two-unit dense layer to predict—for each patient—a set of two parameters, location μ and scale σ, which fully characterized the probability distribution of the patient’s log-time to SCDA (see the ‘Statistical fit’ section), followed by a bespoke activation function. This activation function clipped $$\ln \mu$$ on [ − 3, 3] and clipped σ from below at σmin, where σmin was found such that the difference between the 95th and 5th percentiles of the predicted TSCDA distribution was no less than 1 month. This survival activation function effectively restricted the ‘signal-to-noise’ ratio μ/σ. For the purpose of reconstruction, the encoding was decoded via a sequence of transposed convolutions to re-create the original volume. Feature extraction from the clinical covariate data was performed using a sequence of densely connected layers, followed by a dropout layer to prevent overfitting. The resulting tensor used a similar path to the one followed by the convolutional encoding to eventually map to the two survival parameters. Finally, once the two sub-networks were trained, they were frozen and joined using a learned linear combination layer to ensemble the survival predictions.

The predicted survival parameters (location and scale) aimed to minimize the aforementioned negative log-likelihood function for the log-logistic distribution, accounting for censoring in the data and class imbalance. The reconstructed output of the CMR sub-network minimized the MSE to the original input. Its contribution to total loss was learned to provide regularization to the imaging features extracted, ensuring that the survival fit relied on features able to reconstruct the original image. Both stochastic gradient descent (SGD) and Adam37 optimizers were used. All code was developed in Python 3.7 using Keras 2.2.4 (ref. 38), TensorFlow 1.15 (ref. 39), numpy 1.6.2, scipy 1.2.1, openCV 3.4.2, pandas 0.24.2 and pydicom 1.2.2. Each train/evaluate fold took 3–5 minutes on an Nvidia Titan RTX graphics processing unit.

### Training and testing

The entire model development and internal validation were performed using the LVSPSCD cohort. After a hyperparameter tuning step, the best model architecture was then used on the entire internal validation set to find the best neural network weights. As the ensembling layer was hyperparameter free, it did not use hyperparameter tuning.

#### Hyperparameter tuning

A hyperparameter search was performed using the set of parameter values described in Supplementary Table 2, given the vast number of hyperparameter configurations available to define the model architectures. The package hyperopt 0.1.2 (ref. 40) was used to sample parameter configurations from the search space using the Parzen window algorithm to minimize the average validation loss resulting from a stratified ten-times-repeated ten-fold cross-validation process. The maximum number of iterations was 300 for the covariate sub-network and lowered to 100 for the CMR sub-network, given its highly increased capacity. Each fold was run using early stopping based on the loss value on a withheld 10% portion of the training fold with a maximum of 2,000 epochs (20 gradient updates per epoch). In hyperparameter tuning, models were optimized using SGD with a learning rate of .01 (default value in the neural network package used). The architecture with the highest Harrell’s c-index14 was selected. Hyperparameters deemed to have little effect on learning (for example, maximum number of epochs) were fixed. Convolutional kernel size and the activation function for convolutions were kept at the default values in the neural network package used. The batch size was set to the highest value, given the memory constraints of our hardware.

#### Internal validation and external test

Internal model performance was assessed using ten repetitions of stratified ten-fold cross-validation on the LVSPSCD cohort. Early stopping based on the c-index on a withheld 10% subset was implemented with a maximum training of 2,000 epochs (20 gradient updates per epoch). The optimizer was Adam with learning rate 10−5 for the CMR sub-network, 5 × 10−4 for the covariate sub-network and .01 for the ensemble. A final model was trained with all the available LVSPSCD data and tested on the PRE-DETERMINE cohort. Of note, the final model shares the same architecture and training parameters with all the models in the 100 internal data splits but has different (fine-tuned) weights, which are derived using the entire internal dataset. To estimate CIs on the external cohort, the same cross-validation process was applied to the PRE-DETERMINE cohort, supplementing the training data in each fold with the LVSPSCD cohort. Approximate normal CIs were constructed using the 100 folds.

#### Gradient-based interpretation of SSCAR

The trained network weights in SSCAR were interpreted for both covariate and CMR sub-network using the gradients of outputs with respect to intermediary neural network internal representations of data. For the CMR sub-network, we adapted Grad-CAM41 to work on regression problems and applied it to SSCAR by performing a weighted average of the last convolutional layer feature maps, where the weights were averages of gradients of the location parameter output with respect to each channel. The result was then interpolated back to the original image dimensions and overlaid to obtain the gradient maps shown (Fig. 4a, bottom row). For the covariate sub-network, the gradient of the location parameter output was taken with respect to each of the inputs and averaged over three groups: all patients, patients with SCDA and patients with no SCDA.

### Statistical analysis

All values reported on the internal validation data set were averages over 100 data splits resulting from a ten-times-repeated ten-fold stratified cross-validation scheme. Values reported on the external test dataset represented a single evaluation on the entire set. All CIs were normal approximations resulting from the aforementioned 100 splits. In computing CIs for the external test set, the same procedure was used on all available data, ensuring that test folds came exclusively from the external dataset. Error bars are standard errors with sample standard deviation estimated from the 100 splits. Correlation P value was based on the exact distribution under the bivariate normal assumption. Covariate P values are based on two-sample Welch’s t-test42 for continuous variables and Mann–Whitney U-test for categorical variables. Cox proportional hazards analysis was performed using the Python lifelines 0.25.5 (ref. 43) package; it included a hyperparameter sweep for the 1 and 2 regularization terms and followed the same train/test procedure as the neural network models.

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

Patient data used in this manuscript cannot be made publicly available without further consent and ethical approval, owing to privacy concerns. The CMR images and patient clinical data can be provided by the authors pending Johns Hopkins University institutional review board and Brigham and Women’s Hospital institutional review board approval and a completed material transfer agreement. Requests for these data should be sent to N.A.T. and/or C.M.A.

## Code availability

The code for this project is available under the Johns Hopkins University Academic Software License Agreement at https://gitlab.com/natalia-trayanova/sscar.

## References

1. Fishman, G. I. et al. Sudden cardiac death prediction and prevention: report from a National Heart, Lung, and Blood Institute and Heart Rhythm Society Workshop. Circulation 122, 2335–2348 (2010).

2. Hayashi, M., Shimizu, W. & Albert, C. M. The spectrum of epidemiology underlying sudden cardiac death. Circ. Res. 116, 1887–1906 (2015).

3. Wong, C. X. et al. Epidemiology of sudden cardiac death: global and regional perspectives. Heart Lung Circ. 28, 6–14 (2019).

4. Al-Khatib, S. M. et al. 2017 AHA/ACC/HRS guideline for management of patients with ventricular arrhythmias and the prevention of sudden cardiac death: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society. J. Am. Coll. Cardiol. 72, e91–e220 (2018).

5. Russo, A. M. et al. ACCF/HRS/AHA/ASE/HFSA/SCAI/SCCT/SCMR 2013 appropriate use criteria for implantable cardioverter-defibrillators and cardiac resynchronization therapy: a report of the american college of cardiology foundation appropriate use criteria task force, heart rhythm society, american heart association, american society of echocardiography, heart failure society of america, society for cardiovascular angiography and interventions, society of cardiovascular computed tomography, and society for cardiovascular magnetic resonance. J. Am. Coll. Cardiol. 61, 1318–1368 (2013).

6. Wellens, H. J. et al. Risk stratification for sudden cardiac death: current status and challenges for the future. Eur. Heart J. 35, 1642–1651 (2014).

7. Ganesan, A. N., Gunton, J., Nucifora, G., McGavigan, A. D. & Selvanayagam, J. B. Impact of late gadolinium enhancement on mortality, sudden death and major adverse cardiovascular events in ischemic and nonischemic cardiomyopathy: a systematic review and meta-analysis. Int. J. Cardiol. 254, 230–237 (2018).

8. Deyell, M. W., Krahn, A. D. & Goldberger, J. J. Sudden cardiac death risk stratification. Circ. Res. 116, 1907–1918 (2015).

9. Sabate Rotes, A. et al. Ventricular arrhythmia risk stratification in patients with tetralogy of Fallot at the time of pulmonary valve replacement. Circ. Arrhythm. Electrophysiol. 8, 110–116 (2015).

10. Okada, D. R. et al. Substrate spatial complexity analysis for the prediction of ventricular arrhythmias in patients with ischemic cardiomyopathy. Circ. Arrhythm. Electrophysiol. 13, e007975 (2020).

11. Wu, K. C. et al. Baseline and dynamic risk predictors of appropriate implantable cardioverter defibrillator therapy. J. Am. Heart Assoc. 9, e017002 (2020).

12. Arevalo, H. J. et al. Arrhythmia risk stratification of patients after myocardial infarction using personalized heart models. Nat. Commun. 7, 11437 (2016).

13. Schmidt, A. et al. Infarct tissue heterogeneity by magnetic resonance imaging identifies enhanced cardiac arrhythmia susceptibility in patients with left ventricular dysfunction. Circulation 115, 2006–2014 (2007).

14. Harrell, F. E. Jr, Lee, K. L. & Mark, D. B. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 15, 361–387 (1996).

15. Graf, E., Schmoor, C., Sauerbrei, W. & Schumacher, M. Assessment and comparison of prognostic classification schemes for survival data. Stat. Med. 18, 2529–2545 (1999).

16. Chatterjee, N. A. et al. Sudden death in patients with coronary heart disease without severe systolic dysfunction. JAMA Cardiol. 3, 591–600 (2018).

17. Hinton, G. Deep learning—a technology with the potential to transform health care. JAMA 320, 1101–1102 (2018).

18. Naylor, C. D. On the prospects for a (deep) learning health care system. JAMA 320, 1099–1100 (2018).

19. Wang, F., Casalino, L. P. & Khullar, D. Deep learning in medicine-promise, progress, and challenges. JAMA Int. Med. 179, 293–294 (2019).

20. Ching, T. et al. Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface 15, 20170387 (2018).

21. Miotto, R., Wang, F., Wang, S., Jiang, X. & Dudley, J. T. Deep learning for healthcare: review, opportunities and challenges. Brief. Bioinform. 19, 1236–1246 (2018).

22. Gould, J. et al. Mean entropy predicts implantable cardioverter-defibrillator therapy using cardiac magnetic resonance texture analysis of scar heterogeneity. Heart Rhythm 16, 1242–1250 (2019).

23. Roes, S. D. et al. Infarct tissue heterogeneity assessed with contrast-enhanced MRI predicts spontaneous ventricular arrhythmia in patients with ischemic cardiomyopathy and implantable cardioverter-defibrillator. Circ. Cardiovasc. Imaging 2, 183–190 (2009).

24. Miller, M. A., Gomes, J. A. & Fuster, V. Risk stratification of sudden cardiac death in hypertrophic cardiomyopathy. Nat. Clin. Pract. Cardiovasc. Med. 4, 667–676 (2007).

25. Ebrahimi, Z., Loni, M., Daneshtalab, M. & Gharehbaghi, A. A review on deep learning methods for ECG arrhythmia classification. Expert Syst. Appl. 7, 100033 (2020).

26. Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25, 65–69 (2019).

27. Sannino, G. & De Pietro, G. A deep learning approach for ECG-based heartbeat classification for arrhythmia detection. Future Gener. Comput. Syst. 86, 446–455 (2018).

28. Yıldırım, Ö., Pławiak, P., Tan, R. S. & Acharya, U. R. Arrhythmia detection using deep convolutional neural network with long duration ECG signals. Comput. Biol. Med. 102, 411–420 (2018).

29. Salem, M., Taheri, S. & Yuan, J. S. ECG arrhythmia classification using transfer learning from 2-dimensional deep CNN features. In: 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS) 1–4 (IEEE, 2018).

30. Strauss, D. G. et al. An ECG index of myocardial scar enhances prediction of defibrillator shocks: an analysis of the Sudden Cardiac Death in Heart Failure Trial. Heart Rhythm 8, 38–45 (2011).

31. Zghaib, T. et al. Standard ablation versus magnetic resonance imaging-guided ablation in the treatment of ventricular tachycardia. Circ. Arrhythm. Electrophysiol. 11, e005973 (2018).

32. Kadish, A. H. et al. Rationale and design for the defibrillators to reduce risk by magnetic resonance imaging evaluation (DETERMINE) trial. J. Cardiovasc. Electrophysiol. 20, 982–987 (2009).

33. Popescu, D. M. et al. Anatomically informed deep learning on contrast-enhanced cardiac magnetic resonance imaging for scar segmentation and clinical feature extraction. Cardiovascular Digital Health Journal 3, 2–13 (2022).

34. Jeong, J. H. & Fine, J. Direct parametric inference for the cumulative incidence function. J. R. Stat. Soc. Ser. C Appl. Stat. 55, 187–200 (2006).

35. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30, 1105–1107 (2011).

36. Pepe, M. S., Fan, J. & Seymour, C. W. Estimating the receiver operating characteristic curve in studies that match controls to cases on covariates. Acad. Radiol. 20, 863–873 (2013).

37. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

38. Chollet, F. et al. Keras. https://github.com/fchollet/keras (2015).

39. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. Preprint at https://arxiv.org/abs/1603.04467 (2016).

40. Bergstra, J., Yamins, D. & Cox, D. Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In International Conference on Machine Learning 115–123 (PMLR, 2013).

41. Selvaraju, R. R. et al. Grad-cam: visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision 618–626 (IEEE, 2017).

42. Welch, B. L. The generalization of student’s problem when several different population variances are involved. Biometrika 34, 28–35 (1947).

43. Davidson-Pilon, C. et al. CamDavidsonPilon/lifelines: v0.25.9. https://doi.org/10.5281/zenodo.4505728 (2021).

## Acknowledgements

The authors would like to acknowledge support from National Institutes of Health grants R01HL142496 (N.A.T.), R01HL126802 (N.A.T.) and R01HL103812 (K.C.W.), the Lowenstein Foundation (N.A.T.), National Science Foundation Graduate Research Fellowship DGE-1746891 (J.K.S.), the Simons Fellowship for 2020–2021 (M.M.), National Science Foundation grant IIS-1837991 (M.M.), Air Force Office of Scientific Research FA9550-20-1-0288 and FA9550-21-1-0317 (M.M.) and an Abbott Laboratories research grant (D.C.L.). The PRE-DETERMINE study and the DETERMINE Registry were supported by National Heart, Lung, and Blood Institute research grant R01HL091069 (C.M.A.), St. Jude Medical Inc. and the St. Jude Medical Foundation.

## Author information

Authors

### Contributions

D.M.P., J.K.S., C.M.A., K.C.W., M.M. and N.A.T. contributed to the study design. K.N.A., M.V.M., D.C.L., A.K., C.M.A. and K.C.W. provided clinical perspective and interpretation of clinical variables. C.L., M.V.M. and D.C.L. assisted with data curation. D.M.P., M.M. and N.A.T. developed the methodology. D.O. and N.R.C. assisted with the statistical and machine learning methodology. D.M.P. was responsible for conceptualization, formal analysis, investigation, software development and writing the manuscript. N.A.T. and M.M. were the senior supervisors on all aspects of the project and contributed to the writing. All authors read, edited and approved the final manuscript.

### Corresponding author

Correspondence to Natalia A. Trayanova.

## Ethics declarations

### Competing interests

The machine learning techniques for predicting sudden cardiac death survival discussed in this manuscript relate to pending US provisional patent application 63/287,395 naming Johns Hopkins University as the applicant and listing N.A.T., D.M.P., J.K.S. and M.M. as the inventors.

## Peer review

### Peer review information

Nature Cardiovascular Research thanks Chayakrit Krittanawong, Kalyanam Shivkumat and the other, anonymous, reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data

### Extended Data Fig. 1 Survival Study of Cardiac Arrhythmia Risk Results at Various Time Points.

Receiver operator characteristic curves (ROC) for years 2–9 for the internal and external cohorts, with the respective areas under the curve (AUROC). Predicted outcomes are based on the estimated survival probability at the respective time points as computed from the survival probability function. Shaded areas represent approximate 95% confidence intervals, solid and dashed lines indicate the internal and external cohort averages, respectively, and random chance performance thresholds are shown using dotted lines.

### Extended Data Fig. 2 Individual Patient Survival Probability.

a. Survival probability curves are shown for an example patient in the external test set who experienced sudden cardiac death from arrhythmia (SCDA, left display), and for one who did not (No SCDA, right display). Survival probability curves are plotted over time for Survival Study of Cardiac Arrhythmia Risk (SSCAR) (solid, blue), Cox proportional hazards (Cox PH) model (dashed, green) on the clinical covariates, Kaplan-Meier estimator (dot-dashed, purple), together with the indicator ground truth (dotted, black). For the patient with SCDA, SSCAR crosses the 50% survival probability threshold considerably closer to the SCDA time, as compared to the alternative curves, highlighting the model’s high calibration. For the censored patient (no SCDA), SSCAR estimates higher survival probability at the time of non-SCDA event compared to the other models. b. Examples of SSCAR’s predicted probability distributions for the time to SCDA (shaded areas, pdf(TSCDA)) for two patients in the external test set who experienced SCDA (P1, blue and P2, orange). The predicted times to event (Predicted TSCDA) are depicted as solid vertical lines (peaks of distributions); actual times (Actual TSCDA are depicted by dotted vertical lines. Note that SSCAR has a larger prediction error in P2 compared to P1, seen on the graph as the distance between the respective solid and dotted lines. However, SSCAR ‘recognizes’ the inaccurate TSCDA prediction and compensates for that by also predicting a more spread out distribution (larger scale parameter) for P2. This direct relationship between the prediction error and predicted scale parameter holds more generally for the entire dataset, suggesting SSCAR learns to quantify the degree of inaccuracy in the TSCDA prediction.

### Extended Data Fig. 3 Survival Study of Cardiac Arrhythmia Risk Cardiac Magnetic Resonance (CMR) Sub-network.

Top-left panel (yellow) shows patient data used in by the CMR sub-network. The CMR sub-network uses as input contrast-enhanced (LGE) CMR images with the left ventricle automatically segmented (top inset). Labels associated with each patient (SCDA Events, bottom inset, dot-dashed contour) - consisting of the observed time to event, and an indicator whether the event was sudden cardiac death from arrhythmia (SCDA) or non-SCDA - are used as targets during training only. LGE-CMR data is taken as input by a 3D convolutional neural network constructed using an encoder-decoder architecture (red panel, left). The encoder branch consists of a sequence of 3D convolutions, downsampling (maxpool), and rectified linear unit (ReLU) activation functions. The encoded version of the image is transformed using a dense layer with a custom survival activation function into two parameters (location, μ and scale, σ, bottom panel, blue) for each patient, which fully characterize the probability distribution function of the patient-specific time to SCDA. The decoder branch uses transpose convolutions to reconstruct the original images, thereby ensuring that the encoded version of the CMR data is meaningful and able to reproduce the original images. During training (dot-dashed arrows), the neural network weights are optimized via a likelihood-based loss and a reconstruction loss.

### Extended Data Fig. 4 Clinical Trial Data Diagram.

On the left (continuous outline), the flowchart shows the eligibility assessment process for the internal cohort (LVSPSCD), a subset of which was used for cross-validation and final model training. On the right (dashed outline), the flowchart shows the eligibility assessment process for the external cohort (PRE-DETERMINE and DETERMINE) used for testing the final model.

## Supplementary information

### Supplementary Information

Supplementary Tables 1 and 2

## Source data

### Source Data Fig. 2

Statistical Source Data

### Source Data Fig. 3

Statistical Source Data

### Source Data Fig. 4

Statistical Source Data and Raw Images

### Source Data Extended Data Fig. 1

Statistical Source Data

### Source Data Extended Data Fig. 2

Statistical Source Data

## Rights and permissions

Reprints and Permissions

Popescu, D.M., Shade, J.K., Lai, C. et al. Arrhythmic sudden death survival prediction using deep learning analysis of scarring in the heart. Nat Cardiovasc Res 1, 334–343 (2022). https://doi.org/10.1038/s44161-022-00041-9

• Accepted:

• Published:

• Issue Date:

• DOI: https://doi.org/10.1038/s44161-022-00041-9

• ### The next step in deep learning-guided clinical trials

• Chayakrit Krittanawong

Nature Cardiovascular Research (2022)