Genome-wide association study and meta-analysis identify loci associated with ventricular and supraventricular ectopy

The genetic basis of supraventricular and ventricular ectopy (SVE, VE) remains largely uncharacterized, despite established genetic mechanisms of arrhythmogenesis. To identify novel genetic variants associated with SVE/VE in ancestrally diverse human populations, we conducted a genome-wide association study of electrocardiographically identified SVE and VE in five cohorts including approximately 43,000 participants of African, European and Hispanic/Latino ancestry. In thirteen ancestry-stratified subgroups, we tested multivariable-adjusted associations of SVE and VE with single nucleotide polymorphism (SNP) dosage. We combined subgroup-specific association estimates in inverse variance-weighted, fixed-effects and Bayesian meta-analyses. We also combined fixed-effects meta-analytic t-test statistics for SVE and VE in multi-trait SNP association analyses. No loci reached genome-wide significance in trans-ethnic meta-analyses. However, we found genome-wide significant SNPs intronic to an apoptosis-enhancing gene previously associated with QRS interval duration (FAF1; lead SNP rs7545860; effect allele frequency = 0.02; P = 2.0 × 10⁻⁸) in multi-trait analysis among European ancestry participants and near a locus encoding calcium-dependent glycoproteins (DSC3; lead SNP rs8086068; effect allele frequency = 0.17) in meta-analysis of SVE (P = 4.0 × 10⁻⁸) and multi-trait analysis (P = 2.9 × 10⁻⁹) among African ancestry participants. The novel findings suggest several mechanisms by which genetic variation may predispose to ectopy in humans and highlight the potential value of leveraging pleiotropy in future studies of ectopy-related phenotypes.

HCHS/SOL is a study focused on describing the prevalence of risk and protective factors for chronic conditions and on quantifying all-cause mortality, fatal and non-fatal cardiovascular disease and pulmonary disease, and pulmonary disease exacerbation over time 26. From 2008 to 2011, Hispanic/Latino individuals aged 18-74 years were recruited from randomly selected households in four US communities (Bronx, NY; Chicago, IL; Miami, FL; and San Diego, CA) using a stratified two-stage area probability sampling design. HCHS/SOL includes participants who self-identified as having Hispanic/Latino background, the largest groups being Central American (n = 1,730), Cuban (n = 2,348), Dominican (n = 1,460), Mexican (n = 6,471), Puerto Rican (n = 2,728), and South American (n = 1,068). At the time of the present study, participants had an ECG only at the baseline visit.
Electrocardiography. Trained, certified technicians digitally recorded ECGs at visits 1-5 in ARIC 28; screening and annual visits 3, 6, and 9 in WHI 29; and, for this analysis, the baseline visit in MESA 24, CHS 25, and HCHS/SOL 26. Technicians used comparable procedures for preparing participants, placing electrodes, recording ECGs with MAC PC electrocardiographs (GE Marquette Electronics, Inc., Milwaukee, WI), and telephonically transmitting them to the Epidemiological Cardiology Research Center (Wake Forest School of Medicine, Winston-Salem, NC) for inspection, identification of technical errors/inadequate quality, and analysis using the Marquette 12-SL program (2001 version, GE Marquette, Milwaukee, WI).

Identification of Supraventricular and Ventricular Ectopy.
Since SVE and VE often occur intermittently and in isolation, presence of each phenotype on the ECG was determined independently at each visit. Supraventricular and ventricular ectopic beats were separately detected by computer algorithms based on the Minnesota Code (MC) and visually over-read by physicians (ARIC, WHI, MESA, HCHS/SOL). SVE was defined as ≥1 supraventricular ectopic beat (MC8.
Genotyping, Quality Control, and Imputation. Each cohort or study performed genome-wide genotyping using Affymetrix or Illumina arrays and used similar quality control thresholds for excluding SNPs and samples (Supplementary Table S1). Genotypes were imputed using HapMap 2, HapMap 2 and 3, or 1000 Genomes Phase 1 (version 3, March 2012 release) reference panels. To enable cross-platform comparisons, Build 36 coordinates were converted to Build 37, and analyses were restricted to SNPs present in HapMap 2.
Statistical analysis. We stratified cohort participants by ancestry (and study) into thirteen subgroups of European (ARIC, CHS, MESA, WHI-GARNET, WHI-MOPMAP, WHI-WHIMS), African (ARIC, CHS, MESA, WHI-SHARe), and Hispanic/Latino (MESA, HCHS/SOL, WHI-SHARe) descent. In each subgroup, GWA analyses followed a standard protocol that used repeat ECGs, when available, to increase power. In cohorts with multiple ECGs per participant over time (ARIC, WHI), we estimated ectopy-SNP associations using generalized estimating equation (GEE) methods 30, a logit link, and an exchangeable working correlation structure to control for correlation of repeated measures (R geepack package). In studies analyzed with one ECG per participant (MESA, CHS), we estimated associations using logistic regression (SNPTEST; geeglm function in the R geepack package). Although multiple ECGs were available in MESA and CHS, only baseline visit data were used, in accordance with their analytic pipelines. In HCHS/SOL, we estimated associations among participants unrelated at the third-degree level (one per household) using a generalized linear model and a Firth test 31 to account for small numbers of cases (R logistf package), assuming Census block group effects were negligible. We adjusted all models for age (year), sex (in studies including both sexes), season (quarter), study center (ARIC, CHS, MESA, HCHS/SOL) or geographic region (WHI), and ancestry principal components estimated using Eigenstrat 32 (ARIC, CHS, MESA, WHI) or PC-AiR 33 (HCHS/SOL).
Within subgroups, we compared observed P-values for each SNP with expected values from a χ² distribution using quantile-quantile (Q-Q) plots and genomic inflation factors (lambda). To eliminate statistical artifacts at low allele and ectopy frequencies, the comparisons excluded SNPs with an effective number of minor alleles present in exposed participants (defined as 2 × number of exposed participants × minor allele frequency × imputation quality) <10 or a log odds ratio >10. After filtering, thirteen and twelve subgroups contributed to the SVE and VE meta-analyses, respectively (MESA Hispanic/Latinos did not meet filtering thresholds for VE because of its infrequency; n = 17).
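The SNP filter described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used by the cohorts: the function and parameter names are hypothetical, the example inputs are made up, and applying the log odds ratio threshold to its absolute value is an assumption, since the text states only "log odds ratio >10".

```python
def effective_minor_alleles(n_exposed, maf, imputation_quality):
    """Effective number of minor alleles among exposed participants:
    2 x number of exposed participants x minor allele frequency x imputation quality."""
    return 2.0 * n_exposed * maf * imputation_quality


def passes_filter(n_exposed, maf, imputation_quality, log_odds_ratio,
                  min_effective_alleles=10.0, max_abs_log_or=10.0):
    """Return True if a SNP is retained under the two exclusion rules.

    Assumption: the log odds ratio rule is applied to its absolute value.
    """
    enough_alleles = effective_minor_alleles(
        n_exposed, maf, imputation_quality) >= min_effective_alleles
    plausible_effect = abs(log_odds_ratio) <= max_abs_log_or
    return enough_alleles and plausible_effect


# Illustrative inputs only (not study data).
sparse = passes_filter(n_exposed=17, maf=0.17, imputation_quality=0.9,
                       log_odds_ratio=0.4)
dense = passes_filter(n_exposed=400, maf=0.17, imputation_quality=0.9,
                      log_odds_ratio=0.4)
```

In a hypothetical subgroup with 17 exposed participants, the effective count cannot exceed 2 × 17 × 0.5 × 1 = 17, so only common, well-imputed SNPs would be retained.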

Meta-analysis.
We prioritized trans-ethnic analyses to maximize power and generalizability, given previous research suggesting that causal variants are typically relevant across populations 34, but also conducted ancestry-specific analyses given the potential for differences in linkage disequilibrium (LD) and allele frequency among populations. Analyses involved combining subgroup- and ancestry-specific summary results in 1) fixed-effects, inverse-variance-weighted meta-analyses with genomic control (METAL) and 2) trans-ethnic Bayesian meta-analysis (MANTRA) 35 to account for allelic heterogeneity among ancestry groups. MANTRA clustered similar populations according to allele frequencies, allowed for varying allele effects across populations, and produced Bayes' factors (BFs) for each ectopy-SNP association and its posterior probability of heterogeneity (P_het). We also performed multi-trait SNP association analyses that combined t-test statistics from fixed-effects meta-analyses of SVE and VE, using adaptive sum of powered score (aSPU) methods to investigate potential pleiotropy 36. While etiologies of SVE and VE may differ, combination was justified by extant knowledge of their shared precipitants 14,15, potential co-occurrence (MC 8.1.3 or 8.1.5) 1, and difficulty distinguishing them from each other 37. Multi-trait analyses provided P-values for genetic correlations among traits, but no effect estimates. By convention, we set genome-wide significance at P < 5.0 × 10⁻⁸ and suggestive significance at P < 2.5 × 10⁻⁶ for fixed-effects meta-analyses. For Bayesian meta-analyses, we used a log₁₀ BF ≥ 6.0 as a genome-wide threshold for discovery (to approximate the performance of a P < 5.0 × 10⁻⁸) 38, a P_het < 0.5 as a liberal indicator of homogeneity among subgroups, and ≥ two contributing racial/ethnic groups as a threshold for performing meta-analysis. Suggestive SNPs had log₁₀ BF ≥ 5.0, P_het < 0.5, and ≥ two contributing racial/ethnic groups.
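For readers less familiar with these procedures, the fixed-effects, inverse-variance-weighted combination and the significance thresholds stated above can be sketched as follows. This is a generic textbook illustration, not the METAL or MANTRA implementation: all function names are hypothetical, the example estimates are made up, and requiring all three Bayesian criteria jointly for a genome-wide call is our reading of the text.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted, fixed-effects pooled estimate.

    betas are subgroup log odds ratios; weights are 1 / SE^2.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, two_sided_p(pooled_beta / pooled_se)


def classify_fixed_effects(p_value):
    """Apply the P-value thresholds for fixed-effects meta-analyses."""
    if p_value < 5.0e-8:
        return "genome-wide"
    if p_value < 2.5e-6:
        return "suggestive"
    return "none"


def classify_bayesian(log10_bf, p_het, n_ancestry_groups):
    """Apply the Bayesian thresholds (assumed to be required jointly)."""
    if p_het >= 0.5 or n_ancestry_groups < 2:
        return "none"
    if log10_bf >= 6.0:
        return "genome-wide"
    if log10_bf >= 5.0:
        return "suggestive"
    return "none"


# Illustrative subgroup estimates only (not study data).
beta, se, p = fixed_effects_meta([0.5, 0.6], [0.1, 0.2])
call = classify_fixed_effects(p)
```

Because each weight is the reciprocal of a squared standard error, larger or more precisely estimated subgroups dominate the pooled estimate.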
We report sub-threshold hits for trans-ethnic meta-analyses because they had the largest number of participants. We considered SNPs with ancestry-specific LD r² < 0.2 as independent. We summarized results from genomically controlled meta-analyses in Q-Q plots, Manhattan plots of the −log₁₀ P-value versus SNP position, and regional association plots. We functionally annotated lead and correlated SNPs (r² ≥ 0.8) in relevant cardiac tissues using HaploReg v4.1 39 and visualized relevant tracks using the UCSC Genome Browser. We estimated heritability in European ancestry populations (ARIC, WHI-MOPMAP) using Genome-wide Complex Trait Analysis 40.

Data availability.
Complete results are available on dbGaP at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000930.v5.p1. Primary data are available from the parent studies conditional on review and approval of requests by cohort-specific presentation and publication committees.

Results
Study characteristics. A total of 42,976 participants in thirteen subgroups contributed to the SVE analysis, of whom 22% were of African ancestry, 26% Hispanic/Latino, and 76% female (Table 1). On average, these participants were aged 66.3 years and contributed 2.2 visits (range: 1-5); 2-10% of them had SVE at one or more visits. Estimated heritability (standard error, SE) of SVE in ARIC was 3.2% (3.4%). A total of 44,131 participants in twelve subgroups contributed to the VE analysis, of whom 21% were of African ancestry, 25% Hispanic/Latino, and 74% female. On average, these participants were aged 67.7 years and also contributed 2.2 visits; 1-8% had VE at one or more visits, except in WHI-MOPMAP, which sampled VE cases and controls in equal proportions. Baseline prevalence of VE was <3% in all subgroups, except in WHI-MOPMAP. Heritability of VE in ARIC and WHI-MOPMAP was 9.4% (3.4%) and 32% (14%), respectively. Lambdas from subgroup-specific Q-Q plots of SVE and VE ranged from 0.99 to 1.04 (Supplementary Figs S1 and S2).
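As a reminder of how a genomic inflation factor such as those reported above is conventionally obtained, the following sketch computes lambda as the median observed 1-degree-of-freedom χ² statistic divided by its expected median under the null (approximately 0.455). It also shows one common form of genomic control, inflating standard errors by the square root of lambda when lambda exceeds 1. Function names are hypothetical, the input P-values are simulated, and this is not claimed to be the exact procedure applied by the software used in this study.

```python
import math
from statistics import NormalDist, median

STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p_value):
    """Convert a two-sided P-value (0 < p <= 1) to a 1-df chi-square statistic."""
    return STD_NORMAL.inv_cdf(p_value / 2.0) ** 2


def genomic_inflation(p_values):
    """Lambda: median observed chi-square over its expected median under the null."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


def genomic_control_se(standard_error, lam):
    """Inflate a standard error by sqrt(lambda); no change when lambda <= 1."""
    return standard_error * math.sqrt(max(lam, 1.0))


# Evenly spaced P-values mimic a null distribution, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
```

Values of lambda near 1, such as the 0.99 to 1.04 range reported here, indicate little inflation of test statistics from population stratification or other systematic bias.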

Trans-ethnic meta-analyses.
No SNP associations exceeded a genome-wide threshold for SVE or VE in trans-ethnic, fixed-effects meta-analyses; however, sub-threshold associations were identified for both phenotypes (Supplementary Table S2). Furthermore, Bayesian and multi-trait analyses (not shown) did not identify trans-ethnic loci (Supplementary Fig. S3).
Ancestry-specific meta-analyses: European. There were no genome-wide significant associations in fixed-effects meta-analyses of European ancestry studies of SVE or VE. However, multi-trait analysis identified a locus on chromosome 1 jointly associated with SVE and VE (P = 2.0 × 10⁻⁸; Panels A,B in Fig. 1 S4). The lead SNP rs7545860 and correlated SNPs (r² ≥ 0.2), including rs72692218 and rs66462949, reside in a genomic region including deoxyribonuclease I (DNase I) hypersensitive sites, regulatory motifs, and putative enhancer/promoter histone signals in fetal heart, right atrium and ventricle, and/or aorta (Supplementary Fig. S5). This locus may also include the epidermal growth factor receptor pathway substrate 15 (EPS15) gene through SNPs in LD with rs7545860 (rs17106627 and rs12022046) (Supplementary Fig. S6). These SNPs are also in regions containing DNase I hypersensitivity sites, DNA methylation sites, putative enhancer/promoter histone marks, and regulatory motifs in cardiomyocytes and cardiac fibroblasts (Supplementary Fig. S7).
Ancestry-specific meta-analyses: African. Among African ancestry participants, fixed-effects meta-analysis of SVE identified a novel signal on chromosome 18 (lead SNP rs8086068; effect allele frequency (EAF) = 0.17; P = 2.87 × 10⁻⁹), associated with 75% higher odds of SVE per copy of the C allele (95% CI: 1.46-2.11) (Panel D in Fig. 1). This variant also was directionally consistent, if not significant, among European ancestry studies and one Hispanic/Latino ancestry subgroup (Supplementary Fig. S8). Multi-trait analyses identified this same lead SNP (P = 4.0 × 10⁻⁸), driven by its association with SVE (Table 2; Panel F in Fig. 1). While intergenic, rs8086068 is 206 kb 3′ from the desmocollin 3 (DSC3) gene, one of a family of desmocollin genes clustered in the area, though the SNP is separated from the gene family by a recombination spike and may not interact with it (Supplementary Figs S8, S9). Functional annotation indicates that three SNPs in LD (r² ≥ 0.2) with this lead SNP using the 1000G AFR referent population (rs2097047, rs17711533, rs17711559) occur within DNase I hypersensitivity sites in fetal heart tissue (Supplementary Fig. S10). No SNPs met the genome-wide threshold for significance among African ancestry studies in fixed-effects meta-analyses of VE (Panel E in Fig. 1).
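The relationship between the odds ratio, its confidence interval, and the P-value reported above for rs8086068 can be checked approximately with a Wald calculation on the log odds scale. The sketch below is illustrative only: the function name is hypothetical, the inputs are the rounded values quoted in the text, and exact agreement with the reported P-value is not expected because of rounding and because genomic control may have been applied.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log OR, its standard error, the Wald z, and a two-sided P-value
    from an odds ratio and its 95% confidence interval."""
    log_or = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p_value


# Rounded values quoted in the text: OR = 1.75 (95% CI: 1.46-2.11).
log_or, se, z, p_value = wald_from_or_ci(1.75, 1.46, 2.11)
```

Under these assumptions the recovered P-value is on the order of 10⁻⁹, the same order of magnitude as the fixed-effects result quoted above.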

Discussion
This first GWAS of ectopy identified two biologically plausible loci among European and African ancestry individuals. It identified the FAF1/CDKN2C/EPS15 locus (chromosome 1) in multi-trait meta-analyses of SVE and VE among European ancestry individuals. Earlier GWAS have associated this locus with QRS duration 41. It also identified a second locus among African ancestry individuals, approximately 206 kb 3′ from a desmocollin gene cluster that includes DSC3 and DSC2, the latter previously associated with arrhythmogenic cardiomyopathy (ACM) 42. Together, these findings provide insight into putative mechanisms underlying genetic susceptibility to ectopy. Contrary to expectation, this GWAS of ectopy did not identify any loci meeting the genome-wide threshold for significance in trans-ethnic, fixed-effects or Bayesian meta-analyses of either phenotype. Restriction of analyses to HapMap 2 SNPs may be one reason why none were identified, given the limited genomic coverage of this reference panel, although restriction also enabled cross-platform comparisons. Heterogeneity of association among races/ethnicities due to differences in imputation quality or minor allele frequency may be another. Lastly, as large as our study is, an even larger study may be required to adequately power the identification of trans-ethnically important variants, as discussed further below.
The European ancestry locus identified by rs7545860 is intronic to FAF1, an apoptosis protein-encoding gene previously implicated in GWAS of QRS duration 43. Two SNPs in LD with rs7545860 (rs72692218; rs66462949) are intronic to the nearby gene CDKN2C, a cyclin-dependent kinase inhibitor also implicated by that GWAS. The lead SNP is also in LD with rs17391905, a FAF1 and CDKN2C SNP identified by Sotoodehnia et al. (r² = 0.53, multi-trait aSPU P = 1.60 × 10⁻⁷). Additional SNPs in LD with rs7545860 include intronic variants (rs17106627, rs12022046) of EPS15, a gene that encodes a calcium-binding protein involved in receptor-mediated endocytosis of epidermal growth factor, but has no previously established role in arrhythmogenesis. Functional annotation for these SNPs suggests potential involvement with histone modification and enhancer activity in fetal heart.
It is notable that the aforementioned European ancestry locus (FAF1/CDKN2C/EPS15) was only identified when using adaptive sum of powered score methods to investigate pleiotropy. This finding highlights the potential value of leveraging pleiotropic effects in future studies of ectopy-related phenotypes. Indeed, examining them may well improve understanding of biological mechanisms underlying correlated traits.
No GWAS has been published to date relating arrhythmia to genetic variation in desmocollin cluster genes, including DSC3. The desmocollin gene cluster is of interest because the desmocollins are calcium-dependent glycoproteins involved in cardiac intercellular connections and neighboring gene DSC2 is associated with ACM, a congenital heart disorder characterized by right ventricular fibrofatty infiltration, myocardiocyte apoptosis, gap junction pathophysiology, supraventricular/ventricular arrhythmias, and sudden cardiac death 42. Moreover, several SNPs in LD with lead SNP rs8086068 are located within DNase I hypersensitivity sites in fetal heart tissue, suggesting potential involvement in tissue-specific regulation. We also demonstrated that this variant was directionally consistent among European ancestry studies and one Hispanic/Latino ancestry subgroup, suggesting that differences in risk factors or allelic effects among races/ethnicities may explain the ancestral heterogeneity of effects, a possibility deserving further study. Several other loci that reached the threshold for suggestive significance in trans-ethnic meta-analyses also have biologically plausible relationships with ectopy (Supplementary Discussion).
In addition to the loci discussed above, this paper adds to the literature an estimate of heritability for SVE and VE in European ancestry populations. In lieu of available family-based data, we estimated heritability using two cohorts with the largest number of ectopy cases (ARIC; WHI-MOPMAP) and among European ancestry participants because of the difficulty obtaining minority reference populations. Our finding that the estimated heritability of VE, a binary phenotype, differed in ARIC (9.4%, SE = 3.4%) and WHI-MOPMAP (32%, SE = 14%) is partly attributable to the difference in VE prevalence (and design) between those populations 44,45 . In our study, WHI-MOPMAP sampled VE cases and controls in equal proportions (i.e. 50% of participants had ectopy), but among ARIC European ancestry participants, the prevalence of VE was only 7.9%. The SVE heritability estimate was likely influenced by the same factors. Although the generalizability of such estimates outside of ARIC and WHI-MOPMAP is unknown and the estimates are not directly comparable to those estimated from pedigree data 44,45 , they remain notable findings from this study.
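To illustrate why prevalence and sampling design matter when comparing heritability estimates for a binary phenotype, the sketch below evaluates the commonly used factor for converting an observed-scale estimate to the liability scale, K²(1 − K)² / [z² P(1 − P)], where K is the population prevalence, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. Several assumptions are made purely for illustration: the text does not state the scale on which the estimates above were reported, the function name is hypothetical, and the 7.9% prevalence observed among ARIC European ancestry participants is applied to both designs.

```python
from statistics import NormalDist


def liability_scale_factor(prevalence, case_fraction=None):
    """Multiplier converting observed-scale heritability of a binary trait
    to the liability scale.

    prevalence: population prevalence K of the trait.
    case_fraction: proportion P of cases in the analyzed sample; defaults
    to K, i.e. an unascertained population sample.
    """
    k = prevalence
    p = k if case_fraction is None else case_fraction
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - k)   # liability threshold
    height = std_normal.pdf(threshold)        # normal density at the threshold
    return (k * (1.0 - k)) ** 2 / (height ** 2 * p * (1.0 - p))


# Assumed prevalence of 7.9% for both designs (illustration only).
cohort_factor = liability_scale_factor(0.079)                        # cohort-like sample
balanced_factor = liability_scale_factor(0.079, case_fraction=0.5)   # 1:1 case-control
```

Under these assumptions the two designs imply quite different scaling factors, which is one way in which prevalence and case-control sampling can make estimates from different cohorts appear discordant.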
This work has several limitations that deserve consideration. Despite many participants in this meta-analysis, low prevalence of ectopy as measured by brief ECGs and limited genomic coverage of HapMap 2 (especially in non-European ancestry populations) limited its overall power to identify trans-ethnic signals. By extension, our ability to detect ancestry-specific signals also was limited. Modest power is a well-known limitation of GWAS involving small populations, cross-sectional designs, infrequent outcomes, and brief ECG recordings. However, ectopy has not been examined in a multi-ethnic GWAS, so to examine it, we leveraged the following: (1) imputed genomic data from five cohorts including multiple ancillary studies and thirteen ancestry-stratified subgroups collectively representing >42,000 participants; (2) ECG data from up to five recordings per participant and eleven years of follow-up; (3) relatively powerful, longitudinal and meta-analytic methods that exploit ancestral heterogeneity 35; and (4) multi-trait SNP association methods that exploit phenotypic correlation 36. Leveraging multi-ethnic cohorts and these analytical methods powered the discovery and localization of ectopy-SNP associations, albeit based on ten-second ECG recordings.
We acknowledge that longer ECG recording durations are essential for detecting ectopy with high sensitivity. Although the relative sensitivity of short ECG recordings for ectopy is low (even when repeated), paroxysmal arrhythmias frequent enough to be captured by insensitive, but highly specific, short recordings may have more prognostic significance than those so infrequent that they require long recordings to capture them 15. Moreover, the bias of odds ratios reported here approaches zero because specificity of physician-verified ECGs for ectopy approaches 100% 46 while their sensitivity among participants with and without a given variant is identical 47. It is also possible that short ECG recordings capture frequent ectopy known to increase the risk of myocardial infarction, cardiac, and all-cause mortality in addition to infrequent ectopy associated with a relatively benign prognosis. The group to whom inferences can be made may therefore be heterogeneous. Finally, because independent replication was not feasible due to the current paucity of genotyped cohorts with physician-verified ectopy, we acknowledge that findings may be due to chance. These considerations underscore the need for further confirmation of our findings.

Conclusions
Given these limitations, we view the study findings as hypothesis-generating and have provided publicly accessible summary statistics from ancestry-specific fixed-effects meta-analyses on dbGaP to facilitate external replication. But under those hypotheses, we also provide evidence that variants in FAF1/CDKN2C, EPS15, DSC2/3, and SCN5A on chromosomes 1, 3, and 18 contribute to the genetic risk of supraventricular/ventricular ectopy and arrhythmogenesis in humans via plausible cellular, intercellular, and cationic mechanisms involving myocardiocyte apoptosis, desmosome-related gap junction abnormality, sodium channelopathy, and electrocardiographically manifest derangement of normal atrioventricular physiology.