Machine learning model for predicting late recurrence of atrial fibrillation after catheter ablation

Late recurrence of atrial fibrillation (LRAF) in the first year following catheter ablation is a common and significant clinical problem. Our study aimed to create a machine-learning model for predicting arrhythmic recurrence within the first year since catheter ablation. The study comprised 201 consecutive patients (age: 61.8 ± 8.1; women 36%) with paroxysmal, persistent, and long-standing persistent atrial fibrillation (AF) who underwent cryoballoon (61%) and radiofrequency ablation (39%). Five different supervised machine-learning models (decision tree, logistic regression, random forest, XGBoost, support vector machines) were developed for predicting AF recurrence. Further, SHapley Additive exPlanations were derived to explain the predictions using 82 parameters based on clinical, laboratory, and procedural variables collected from each patient. The models were trained and validated using a stratified fivefold cross-validation, and a feature selection was performed with permutation importance. The XGBoost model with 12 variables showed the best performance on the testing cohort, with the highest AUC of 0.75 [95% confidence interval 0.7395, 0.7653]. The machine-learned model, based on the easily available 12 clinical and laboratory variables, predicted LRAF with good performance, which may provide a valuable tool in clinical practice for better patient selection and personalized AF strategy following the procedure.

support predicting LRAF [12][13][14]. Although deep learning models provide high prediction accuracy 15, explaining their predictions remains a challenging step, and as stated by the authors of 12: "we cannot provide an explicit survival function or equation, and we cannot suggest specific cut-off values of predictors because of the 'blackbox' characteristic of the model".
In this study, we compared selected top-performing machine learning models for predicting LRAF following PV isolation by cryoballoon or RF ablation. Next, we derived visual explanations using the well-known SHapley Additive exPlanations (SHAP) framework 16. SHAP enables us to assign an importance value to each feature in a particular prediction. Understanding why the model makes a certain prediction is as important as the accuracy of the developed models. Finally, similarly to Kim et al., we considered ERAF as an explanatory variable, and the major contribution of this work is confirming that ERAF is an important predictor of LRAF 10.

Methods
Study population. This study comprised 201 consecutive patients with documented symptomatic paroxysmal, persistent, and long-standing persistent AF. The patients were scheduled to undergo cryoballoon ablation or RF ablation using CARTO mapping at the Cardiology Department in the Multidisciplinary Hospital in Nowa Sól, Poland. A total of 164 patients underwent the PV isolation procedure for the first time, while 34 and 3 patients underwent ablation for the second and third time, respectively 9. Exclusion criteria included intracardiac thrombi, myocardial infarction, stroke or cardiac surgery in the previous 3 months, malignancies, autoimmune or inflammatory disease, antibiotic therapy, and heart failure exacerbation. All the patients signed a written consent to participate in the study, and the study protocol was approved by the Medical Ethics Committee at Poznań University of Medical Sciences (Approval 44/16). The study was carried out from May 2016 until March 2018 9. All the participants fulfilled the criteria and completed the study.

Radiofrequency ablation. Pre-procedural transthoracic and transoesophageal echocardiography (TEE) were performed in all the patients prior to ablation. RF ablation was performed using the focal ablation strategy guided by the CARTO 3-D mapping system (Biosense Webster, Diamond Bar, CA). The ablation procedure was performed under local anesthesia with mild conscious sedation. The double transseptal puncture with LassoNav and Navistar ST electrodes was performed under fluoroscopic guidance. Immediately after the puncture, intravenous unfractionated heparin (UFH) was administered. An activated clotting time was maintained between 300 and 350 s 11. A voltage map of the left atrium and PVs was created. PV isolation was performed using 7F Navistar ThermoCool and 8F ThermoCool SmartTouch SF catheters (Biosense Webster, Diamond Bar, CA). The standard energy settings were 30 Watts for 30 s at the anterior LA wall, and 20 Watts at the posterior LA wall. In 5 patients, RF ablation was performed using the "ablation index" algorithm 9. The ablation lines were verified using the voltage map. Additional cavotricuspid isthmus ablation was performed in the patients with a concomitant typical atrial flutter. Additional ablations such as low-voltage zone ablation, complex fractionated atrial electrogram-guided ablation, or linear ablation were performed at the operator's discretion if AF was induced after PV isolation.

Cryoballoon ablation.
All the procedures were performed under local anesthesia with mild conscious sedation. In the cryoablation group, the second-generation cryoballoon ablation catheter was used (Arctic Front Advanced, Medtronic, Minneapolis, MN, USA). Venous access was obtained via the groin, and the transseptal puncture was performed by means of a Brockenbrough needle (St. Jude Medical). In addition, a 15 Fr steerable sheath (FlexCath Advance, Medtronic) and an integrated inner-lumen circular mapping catheter (CMC, Achieve™; Medtronic, Inc.) were applied 9. The cryoballoon was introduced into the target PV, and its position was confirmed by contrast retention. The freezing cycles, lasting 180-240 s, were performed for each PV, and isolation was confirmed with the Achieve catheter 9. In the absence of PV potentials, the procedure was ended; otherwise, further cryoapplications were performed. During the applications in the right veins, the diaphragmatic nerve was constantly stimulated (30/min) to avoid its paralysis. Freezing was immediately terminated at any indication of diaphragmatic weakness or palsy.
Biochemical analyses. Blood samples were obtained at baseline and 24 h after ablation 9. Venous blood was drawn from the basilic vein. All routine biochemical analyses (hsTnT, CK, CKMB, CRP, D-dimer, fibrinogen) were performed immediately in the central hospital laboratory. High-sensitivity TnT (hsTnT) was analyzed by electrochemiluminescence immunoassay (ECLIA). The principle of the sandwich ECLIA method involves the immobilization of troponin T using a biotinylated monoclonal anti-troponin T antibody and a monoclonal anti-troponin T antibody labeled with a ruthenium complex. hsTnT was measured by means of a Cobas c601 device with a cut-off value of 14 pg/L (Roche Diagnostics GmbH, Germany). The serum creatinine level was measured using Creatinine Jaffe Gen.2 kits (CREJ2; Roche, Mannheim, Germany), a kinetic colorimetric assay based on the Jaffé method. In alkaline solution, creatinine forms a yellow-orange complex with picrate. The rate of dye formation is proportional to the creatinine concentration in the specimen. The levels of sodium and potassium were determined by indirect potentiometry using ion-selective electrodes (ISE) (COBAS C501, Roche, Germany). Serum aspartate aminotransferase (AST) and alanine aminotransferase (ALT) were determined by means of a kinetic method with NADH and TRIS buffer (Roche, Mannheim, Germany). CK was determined using a kinetic serum test with phosphocreatine and ADP. Creatine kinase (CK) catalyzes the reaction between creatine phosphate (CP) and adenosine 5′-diphosphate (ADP) with formation of creatine and adenosine 5′-triphosphate (ATP). The latter phosphorylates glucose to glucose-6-phosphate (G6P) in the presence of hexokinase (HK). G6P is oxidized to gluconate-6-phosphate in the presence of nicotinamide adenine dinucleotide phosphate (NADP) in a reaction catalyzed by glucose-6-phosphate dehydrogenase (G6P-DH). The conversion is monitored kinetically at 340 nm by the rate of increase in absorbance resulting from the reduction of NADP to NADPH, which is proportional to the activity of CK present in the sample. CKMB concentrations were analyzed with a CKMB immunoassay (Roche, Germany). The test contains two monoclonal antibodies against epitopes of the CK-MB molecule, one gold-labelled, the other biotinylated. The antibodies form a sandwich complex with CK-MB in the blood. CRP was tested with an immunoturbidimetric latex CRP assay (Roche Diagnostics GmbH). Human CRP agglutinates with latex particles coated with monoclonal anti-CRP antibodies. The precipitate is determined turbidimetrically. D-dimer was assayed with an immunoturbidimetric method using STA-Liatest D-Di Plus (Stago, France). The assay was calibrated with the manufacturer's calibrator. Fibrinogen, INR, and APTT were measured with an STA Compact Max mechanical coagulometer (Diagnostica Stago, France). The STA Compact Max analyser measures the coagulation time mechanically using a viscosity-based detection system (VBDS). The peripheral blood count was determined with a CELL-DYN Ruby analyser using flow cytometric techniques combined with the MAPSS technology (Abbott Diagnostics, USA) 9. In the study, residual fresh (< 4 h) ethylenediaminetetraacetic acid (EDTA)-anticoagulated samples were used. Normal reference ranges were as follows: WBC 4.0-10.0 × 10⁹/L; fibrinogen 200-400 mg/dl; CK 0-190 U/L; and CK-MB 7-25 U/L. The CRP and D-dimer cut-off values were 0.5 mg/dl and 0.5 μg/ml, respectively. The extent of biomarker elevation was defined as the post-procedure recorded value minus the baseline value (day 0).
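The biomarker elevation defined above (post-procedure value minus baseline value) can be illustrated with a minimal Python sketch. The dictionary layout, the function name `biomarker_delta` and all numeric values are hypothetical and serve only to show the subtraction; they are not study data.

```python
# Hypothetical baseline (day 0) and 24-h post-ablation values; not study data.
baseline = {"hsTnT": 8.0, "CK": 95.0, "CK-MB": 12.0}
post_ablation = {"hsTnT": 950.0, "CK": 310.0, "CK-MB": 48.0}


def biomarker_delta(before, after):
    """Return post-procedure minus baseline for each biomarker present in both."""
    return {name: after[name] - before[name] for name in before if name in after}


delta = biomarker_delta(baseline, post_ablation)
```

A positive delta indicates that the marker rose after the procedure.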
Clinical follow-up. The patients were monitored for the first 24 h following ablation. The 24-h Holter monitoring was performed in an outpatient clinic within the first 3 months after ablation and at 6, 9, and 12 months during the follow-up (Mortara Instrument, Milwaukee, WI). Additionally, a 12-lead electrocardiogram (ECG) was recommended for the patients with the symptoms of arrhythmia. LRAF was defined as any symptomatic or asymptomatic atrial tachyarrhythmia (AF, atrial tachycardia [AT], or atrial flutter [AFL]) lasting > 30 s that occurred from 3 months to 1 year after ablation. Antiarrhythmic drugs (AAD) were not routinely used after ablation, except for the highly symptomatic patients with ERAF. Oral anticoagulants were continued for at least 2 months 11. The decision to continue anticoagulation was based on the individual's stroke risk determined by the CHA2DS2-VASc score 9,11.
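The LRAF definition above can be expressed as a simple rule over recorded episodes. The following Python sketch is an illustration only: the `Episode` structure, the approximation of 3 months as 90 days and 1 year as 365 days, and the treatment of the window boundaries are assumptions not stated in the text.

```python
from dataclasses import dataclass

BLANKING_DAYS = 90     # assumption: "3 months" approximated as 90 days
FOLLOW_UP_DAYS = 365   # assumption: "1 year" approximated as 365 days
MIN_DURATION_S = 30    # episodes must last longer than 30 s


@dataclass
class Episode:
    day: int           # days since ablation when the episode was recorded
    duration_s: float  # duration of the atrial tachyarrhythmia in seconds


def has_lraf(episodes):
    """True if any episode > 30 s falls between 3 months and 1 year after ablation."""
    return any(
        e.duration_s > MIN_DURATION_S and BLANKING_DAYS < e.day <= FOLLOW_UP_DAYS
        for e in episodes
    )
```

Under these assumptions, an episode of 45 s on day 200 counts as LRAF, whereas an episode on day 30 or one lasting 20 s does not.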

Statistical analyses.
We considered the following two groups of patients depending on the occurrence of LRAF: (1) patients with LRAF; (2) patients without LRAF. The normality of continuous variables was tested using the Shapiro-Wilk test. Next, the Mann-Whitney test was used for non-normally distributed variables, and the Student's t-test was used for normally distributed variables. The Pearson chi-square test for independence was applied to categorical variables. The analyses were done using the R programming language. The significance threshold for p values was 0.05.
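As an illustration of the categorical comparison described above, the Pearson chi-square test for a 2 × 2 table can be written in a few lines. The analyses in this study were done in R; the Python sketch below is a dependency-free illustration, not the code used. It applies no continuity correction, and the counts are hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns (statistic, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = outcome present/absent, columns = factor present/absent.
table = [[20, 30], [10, 40]]
statistic, p_value = chi_square_2x2(table)
significant = p_value < 0.05  # threshold stated in the text
```

The closed-form p value holds only for one degree of freedom, which is the case for any 2 × 2 table.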

Model development for AF prediction.
We formulated LRAF as a binary classification problem and predicted its occurrence. In the experimental evaluation, we adapted the following top-performing benchmark machine learning algorithms: random forest (RF), decision trees (DT), support vector classifier, XGBoost (XGB), and logistic regression (LogR). The classifiers were constructed using the sklearn and XGBoost libraries for the Python programming language. Finally, Shapley values were calculated to explain LRAF predictions. SHAP is one of the most frequently used model-agnostic methods and is commonly used for tabular data 16. SHAP explanations were derived for the top-performing classifier, namely XGBoost, using the SHAP library for Python.
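To make the idea behind Shapley values concrete, the sketch below computes them exactly for a tiny model by enumerating all feature coalitions. It is an illustration under stated assumptions, not the procedure used in this study: the SHAP library relies on dedicated estimators, features outside a coalition are simply set to a baseline value here, and `toy_model` with its three features is entirely hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features outside a coalition take their baseline value (a simplification).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy risk score over three made-up features; not the study model.
def toy_model(features):
    a, b, c = features
    return 0.1 + 0.4 * a + 0.2 * b + 0.1 * a * c


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that local explanation plots rely on. The interaction term is shared equally between the two features involved.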
Experimental set-up. The dataset with all the patients was randomly split into training and test sets (90%) and a validation set (10%). Next, repeated stratified fivefold cross-validation was applied to train the classification algorithms on the training set and to select the optimal subsets of variables to be included in the predictive model. The permutation importance algorithm (with 50 permutations) was applied to reduce dimensionality and select the subsets of variables with the following indices: i = 8, 12, 16, and 20. A subset of variables was considered optimal if the F1 score for the test set was maximal. Also, the HAS-BLED score (a scoring system developed to assess the 1-year risk of major bleeding) was added to the subset of selected variables to improve the interpretation potential of the model outcomes. For a fair comparison, the same subsets of data were considered for each fold, regardless of the algorithm.
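The two building blocks of this set-up, stratified fold assignment and permutation importance, can be sketched without external libraries. This is a simplified illustration, not the sklearn-based code used in the study: the round-robin fold assignment, the use of accuracy as the scoring metric, the toy classifier and the toy data are all assumptions made for the example. Only the number of folds (5) and permutations (50) come from the text.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that keep the class ratio of labels."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)  # deal round-robin within the class
    return folds


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=50, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    reference = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(reference - accuracy(y, [predict(row) for row in shuffled]))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 carries the label, feature 1 is uninformative.
y = [1] * 20 + [0] * 50
X = [[float(label), (i % 7) / 7.0] for i, label in enumerate(y)]


def toy_classifier(row):
    return 1 if row[0] > 0.5 else 0


folds = stratified_folds(y, k=5)
importances = permutation_importance(toy_classifier, X, y, n_repeats=50)
```

In this toy case every fold receives the same number of positive and negative samples, and only the informative feature obtains a non-zero importance. In practice the importance would be computed for a fitted model on held-out data.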

Ethics approval and consent to participate.
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of Poznan University of Medical Sciences (protocol code 44/16). Signed informed consent was obtained from every subject involved in the study.

Results
The study comprised 201 patients with AF treated with cryoablation (122 patients) and RF ablation (79 patients). Over 80 baseline clinical, procedural, and laboratory characteristics considered in this study, stratified by the presence of LRAF during the follow-up, are described in Table S1 in the Supplementary Materials. Their statistical characteristics are summarized in Table 1. Additional laboratory data are presented in Table S2 in the Supplementary Materials. During the first year, LRAF occurred in 57 patients (28.3%). A 12-month follow-up was completed by all the patients. As shown in Table 1, patients with LRAF were significantly more likely to have a history of ERAF, coronary artery disease, and a higher CHA2DS2-VASc score. Also, the patients with larger LA volume, a higher number of applications, and longer procedure time showed a significantly higher risk of LRAF. Additionally, the extent of CK-MB elevation was significantly lower in patients with LRAF than in those without LRAF.
The respective receiver operating characteristic (ROC) curves for this model calculated for the validation set are presented in Fig. 1. The XGBoost model with the 12 variables achieved the highest area under the curve (AUC) of 0.75.
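The AUC reported above can be read as the probability that a randomly chosen patient with LRAF receives a higher predicted score than a randomly chosen patient without LRAF. The following Python sketch computes it directly from that pairwise definition; the function name `roc_auc`, the labels and the scores are hypothetical and not study data.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities; not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
auc = roc_auc(y_true, scores)
```

In this toy example 12 of the 15 positive/negative pairs are ordered correctly, so the AUC is 0.8. The pairwise loop is quadratic in the number of samples, which is acceptable for small validation sets.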
The variables selected as most discriminative in this model are the following: ERAF, TSH, RBC, HAS-BLED score, BMI, statin therapy, parameters measured prior to ablation such as glucose, diastolic blood pressure, and urea, as well as parameters measured following ablation such as high-sensitivity troponin T, hemoglobin, and fibrinogen. Figure 2 shows the SHAP (SHapley Additive exPlanations) global explanation (summary) plot. Each point in the figure represents a classified data point, and the color code represents its range of feature values. SHAP presents the model output for a given class (here LRAF prediction) as an inverted pyramid of the features contributing most to that class. High values of the top 2 features, ERAF and TSH, contribute most to predicting LRAF, while low values of the next 3 features, RBC, hs-troponin T following the procedure, and HDL cholesterol prior to ablation, contribute positively to this class.
Figure 3 explains in detail the prediction from the considered XGB model for an illustrative patient from the validation set classified as a false positive (FP). The red arrows represent the features that drive the prediction towards LRAF, while the blue arrows represent the features that drive the prediction against it. In this example, a higher HAS-BLED score, lower RBC levels, statin therapy, and higher TSH are the factors that increased the predicted risk of LRAF.
Finally, in Fig. 4, we explain in detail the prediction from the considered XGB model for an illustrative patient from the validation set classified as a false negative (FN). In this example, lower TSH values, lack of statin therapy, and higher RBC and hemoglobin levels following ablation are the factors that decreased the predicted risk of LRAF, despite the occurrence of ERAF.

Discussion
Late recurrence of atrial fibrillation (LRAF) is a common phenomenon after pulmonary vein isolation, and the prognosis after the procedure is highly complex. Previous studies have demonstrated that machine learning techniques can be effectively applied for AF recurrence prediction and may have better performance than conventional statistical analysis 14. In a recent systematic review of 33 studies developing or validating 13 models based on the c-statistic, highly variable discriminatory ability was observed, ranging from very poor to very good 13. However, the risk of bias was high, and many studies lacked internal validation in model development.
In this study, we developed a machine learning model for predicting AF recurrence in the first year after catheter ablation. In the examined group of 201 patients, LRAF occurred in 28% of patients, with comparable frequency in both types of ablation. The proposed XGBoost model showed better performance in predicting LRAF compared to our previous model for ERAF 9.
The XGBoost model with 12 variables commonly available in clinical practice showed the best performance on the testing cohort. As illustrated in Fig. 2, ERAF was the most important factor in the model. In addition, the SHAP results demonstrated that higher values of TSH, HAS-BLED score, statin therapy, and fibrinogen, lower values of parameters measured after ablation such as RBC, troponin, and hemoglobin, as well as lower values of parameters measured before ablation such as HDL, glucose, diastolic blood pressure, BMI, and urea were associated with an increased risk of LRAF. Our observations show that ERAF is the factor that strongly predisposes patients to LRAF, as it makes the largest contribution to the model.
Several other studies have reported ERAF as a very strong predictor of LRAF, both after single and multiple procedures 10,17,18. Moreover, it has been shown that the cause of ERAF is not only the inflammatory process and tissue necrosis, but also reconnections within the pulmonary veins 18. Therefore, Kim et al. suggest that ERAF may be a surrogate marker of the severity of AF itself and should not be considered as a transient phenomenon, but as a strong predictor of LRAF 10. Thyroid disorders are increasingly recognized as a factor responsible for AF 19. In the study of Morishima et al., hypothyroidism and high-normal TSH levels were independent predictors of atrial tachyarrhythmia recurrence following catheter ablation 20. In the study of Kim et al., the hemoglobin level was also significantly lower (p < 0.001), and anemia was more commonly found (p < 0.001) in patients with clinical recurrence following ablation than in those who remained in sinus rhythm 21. On the other hand, the HAS-BLED

Figure 1. Performance of the selected classifiers with 12 features on the validation set, illustrated with receiver operating characteristic (ROC) curves.

Figure 2. The SHAP summary plot from the XGB classifier shows the ranking of the top-most contributing features. The positive contribution towards that class is shown on the positive side of the X-axis (representing positive SHAP values), while the negative side of the axis represents a negative contribution or the contribution of those features against the prediction of that class. The XGBoost model, data samples and running examples in the Python programming language are available through the GitHub platform (https://github.com/kasiakaczmarek/predicting-late-recurrence-of-atrial-fibrillation).

Figure 3. SHAP local explanations from XGB for an illustrative patient from the validation set classified as a false positive (FP). The red arrows represent the features that drive the prediction towards LRAF, while the blue arrows represent the features that drive the prediction against it.

Figure 4. SHAP local explanations from XGB for an illustrative patient from the validation set classified as a false negative (FN). The red arrows represent the features that drive the prediction towards LRAF, while the blue arrows represent the features that drive the prediction against it.

Table 2 shows the comparative analysis of the performance of selected classifiers. As observed, XGBoost with 12 variables achieved the highest F1 score of 0.547.
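For reference, the F1 score used to compare the classifiers is the harmonic mean of precision and recall, both of which follow from the counts of true positives, false positives (FP) and false negatives (FN). The Python sketch below uses hypothetical counts; it does not reproduce the values in Table 2.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts; not the study's confusion matrix.
precision, recall, f1 = precision_recall_f1(tp=6, fp=4, fn=2)
```

Because true negatives do not enter the formula, F1 is less flattering than accuracy when the positive class is the minority, as is the case for LRAF in this cohort.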

Table 1. Baseline characteristics and comparison of patients with and without LRAF following catheter ablation. Continuous data of normal distribution are presented as mean ± standard deviation (SD). Continuous data of non-normal distribution are presented as mean and interquartile range (IQR), calculated as Q1-Q3. Categorical data are presented as counts with their percentage values in brackets. p values from the Student's t-test are reported for normally distributed variables, p values from the Mann-Whitney test are reported for non-normally distributed variables, and p values from the Pearson chi-square test for independence are reported for categorical variables. BMI body mass index; LRAF late recurrence of atrial fibrillation; ERAF early recurrence of atrial fibrillation; RF radiofrequency; LA volume left atrial volume; CHA2DS2-VASc congestive heart failure, hypertension, age ≥ 75 (doubled), diabetes, stroke (doubled), vascular disease, age 65-74, sex (female); HAS-BLED hypertension, abnormal renal/liver function, stroke, bleeding history or predisposition, labile INR, elderly (> 65 years), drugs/alcohol concomitantly; SBP systolic blood pressure; DBP diastolic blood pressure; CAD coronary artery disease; CCB calcium channel blockers; NOAC non-vitamin K antagonist oral anticoagulant; VKA vitamin K antagonist; ACE-I angiotensin converting enzyme inhibitor; ARB angiotensin II receptor blocker; GFR glomerular filtration rate; CRP C-reactive protein; PLT platelets; hs-TnT high-sensitivity cardiac troponin T; CPK creatine kinase; CK-MB creatine kinase-MB fraction. Δ (delta) denotes the response to the ablation procedure and was defined as the change in the biomarker concentration between two assays performed within a 24-h period (after ablation minus before ablation). Significant difference: LRAF (+) versus LRAF (−). Significant values are in bold.
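The CHA2DS2-VASc components listed in the legend above translate into a simple additive score. The Python sketch below follows that list, with two points for the doubled items and one point for the others; the function name and argument names are hypothetical.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes, stroke, vascular_disease):
    """Additive CHA2DS2-VASc score following the components in the Table 1 legend."""
    score = 0
    score += 1 if chf else 0                # congestive heart failure
    score += 1 if hypertension else 0       # hypertension
    if age >= 75:
        score += 2                          # age >= 75 (doubled)
    elif age >= 65:
        score += 1                          # age 65-74
    score += 1 if diabetes else 0           # diabetes
    score += 2 if stroke else 0             # stroke (doubled)
    score += 1 if vascular_disease else 0   # vascular disease
    score += 1 if female else 0             # sex (female)
    return score
```

For example, a 70-year-old woman with hypertension and no other listed condition scores 3 under this rule.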