Introduction

The QRS complex (Fig. 1) of the electrocardiogram (ECG) is a summation of electrical activity in the heart during ventricular depolarisation. It represents electrical activation of the left and right ventricles of the heart, propagated through the specialised conduction system1, which includes the His bundle, right and left bundle branches and the Purkinje network. Normal ventricular depolarisation is a rapid process activating different ventricular myocardial segments in a precise temporal sequence, resulting in a narrow QRS complex with a characteristic pattern on the 12-lead ECG1. Deviation from the norm has been associated with sudden cardiac arrest (prolonged ventricular activation time)2,3 and mortality in the general population (prolonged QRS duration)4 and among individuals free of cardiovascular disease (low QRS voltage)5.

Fig. 1
ECG wave and the QRS complex. We denote QRS measures used in the analysis with grey lines. We illustrate the different viewpoints of the 12 ECG leads in the upper-right corner. Amp. amplitude, Dur. duration, VAT ventricular activation time

Many traits and diseases affect the morphology of the QRS complex, including body habitus, primary conduction abnormalities, hypertrophy and dilatation of the ventricles, myocardial infarction, pericardial effusion and lung disease6. Measures of the QRS complex have been used as prognostic indicators and markers of heart disease severity, such as heart failure7, ventricular hypertrophy8 and amyloidosis9.

Prior genome-wide association studies (GWAS) of the QRS complex were limited to testing QRS duration and voltage criteria reflecting left ventricular hypertrophy, yielding sequence variants at 54 loci with minor allele frequency (MAF) >4%10,11,12,13,14,15.

To gain a better understanding of the molecular mechanism of ventricular conduction, we perform a large GWAS of ten measures of the QRS complex in 12 leads from ECGs of 81,192 individuals. These ten measures are the Q, R and S wave amplitudes and durations, QRS complex area and duration, ventricular activation time, and the distance from the peak of the R wave to the nadir of the S wave (QRS p-2-p). We analyse each measure for 12 ECG leads: limb leads (I–III), augmented limb leads (aVR, aVL, aVF) and precordial leads (V1–V6). We also analyse three other parameters: mean QRS duration over 12 leads, the Cornell voltage criterion and the Sokolow-Lyon voltage criterion. In total, we analyse 123 QRS complex parameters and this results in the identification of 190 associations between the QRS complex and variants at 130 loci, for which we further assess effects on seven echocardiographic traits and 22 cardiovascular diseases. We demonstrate the advantage of analysing 12 leads of the ECG and identify rare protein-coding variants in genes that can improve the understanding of risk and progression of heart disease and help direct future studies.
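The count of 123 parameters follows from ten per-lead measures across 12 leads plus three parameters that are not lead-specific. A minimal sketch of that enumeration is given below; the label strings are illustrative and not the identifiers used in the analysis.

```python
from itertools import product

# The ten per-lead QRS measures named in the text.
MEASURES = [
    "Q amplitude", "Q duration",
    "R amplitude", "R duration",
    "S amplitude", "S duration",
    "QRS area", "QRS duration",
    "ventricular activation time", "QRS p-2-p",
]

# Limb leads, augmented limb leads and precordial leads.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]

# The three additional parameters that are not tied to a single lead.
GLOBAL_PARAMETERS = [
    "mean QRS duration",
    "Cornell voltage criterion",
    "Sokolow-Lyon voltage criterion",
]


def qrs_parameters():
    """Return one illustrative label per analysed QRS parameter."""
    per_lead = [f"{measure}, lead {lead}"
                for measure, lead in product(MEASURES, LEADS)]
    return per_lead + GLOBAL_PARAMETERS


params = qrs_parameters()
print(len(params))  # 10 * 12 + 3 parameters
```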

Results

QRS associations

We performed a GWAS on 81,192 individuals, testing 32.5 million common and rare (MAF > 0.01%) SNPs and indels for association with 123 QRS complex parameters (Supplementary Fig. 1, Supplementary Tables 1 and 2). The variants were identified by whole-genome sequencing 15,220 Icelanders and imputed into 151,677 long-range phased individuals and their relatives16. We adjusted the threshold for genome-wide significance with a weighted Bonferroni procedure, using as weights the predicted functional impact of association signals17 (Supplementary Table 3). Sequence variants at 130 loci (MAF > 0.1%) associate with at least one QRS complex parameter. Using conditional analysis, we identified 60 secondary signals at 32 of the loci, resulting in 190 distinct associations (Supplementary Data 1 and 2). We replicated associations at all 54 reported QRS loci with at least one of the 123 QRS parameters (Supplementary Data 3, Supplementary Table 4; P < 0.05/123 = 4.1 × 10−4).
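To illustrate the two kinds of threshold used above, the sketch below computes the plain Bonferroni replication threshold (0.05/123) and a generic weighted Bonferroni scheme in which each annotation class receives a threshold proportional to its weight. The class names, variant counts and weights are hypothetical placeholders, not the values in Supplementary Table 3, and the function shows the general technique rather than the exact procedure of ref. 17.

```python
def weighted_bonferroni_thresholds(class_counts, class_weights, alpha=0.05):
    """Per-class P value thresholds for a weighted Bonferroni correction.

    Each variant in class c gets threshold alpha * w_c / sum_j(n_j * w_j),
    so the thresholds summed over all tested variants equal alpha.
    """
    total = sum(class_counts[c] * class_weights[c] for c in class_counts)
    return {c: alpha * class_weights[c] / total for c in class_counts}


# Hypothetical annotation classes, counts and weights (illustration only).
counts = {"loss-of-function": 10_000, "missense": 200_000, "other": 32_290_000}
weights = {"loss-of-function": 100.0, "missense": 30.0, "other": 1.0}
thresholds = weighted_bonferroni_thresholds(counts, weights)

# Plain Bonferroni threshold used for replicating reported loci.
replication_threshold = 0.05 / 123

for name, value in thresholds.items():
    print(f"{name}: P < {value:.2e}")
print(f"replication: P < {replication_threshold:.1e}")
```

Under such a scheme, classes with a higher prior probability of harbouring causal variants are tested at a less stringent threshold, while the family-wise error rate is still controlled at the chosen level.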

Among the 190 variants that associate with QRS parameters, 159 are common (MAF > 5%) with effects ranging in magnitude from 0.04 to 0.14 standard deviations (s.d.), 14 are low-frequency (1% < MAF ≤ 5%) with effects ranging from 0.11 to 0.35 s.d., and 17 are rare (MAF ≤ 1%) with effects ranging from 0.19 to 1.9 s.d.
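The frequency classes used above can be written as a small helper. This is only a sketch of the stated cut-offs (common: MAF > 5%; low-frequency: 1% < MAF ≤ 5%; rare: MAF ≤ 1%); the function name is hypothetical.

```python
def maf_class(maf):
    """Classify a minor allele frequency given as a fraction (0.013 = 1.3%)."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("MAF must be in the interval (0, 0.5]")
    if maf > 0.05:
        return "common"
    if maf > 0.01:
        return "low-frequency"
    return "rare"


# Counts reported in the text for the 190 QRS variants.
reported_counts = {"common": 159, "low-frequency": 14, "rare": 17}
total_variants = sum(reported_counts.values())

print(maf_class(0.044))   # a variant with MAF = 4.4%
print(total_variants)
```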

We found genome-wide significant associations with 115 of 123 QRS parameters tested. There were no genome-wide significant associations with eight Q wave amplitude and duration parameters, of which four had relatively few available measurements (N < 15,000). Of the 190 variants, 160 associate genome-wide significantly with more than one QRS parameter. The R amplitude is the QRS measure that captures the most associations (N = 96), followed by QRS area (N = 85) and R duration (N = 76), while the Cornell voltage criterion captures the fewest (N = 18) (Supplementary Fig. 2). Some of the variants associate only with the R amplitude (N = 16), R duration (N = 9), S amplitude (N = 9), QRS area (N = 5), QRS p-2-p (N = 4) or ventricular activation time (N = 4).

For the R amplitude, we identified most associations with lead V1 (N = 47, Fig. 2). Fifteen variants associate only with the R amplitude in lead V1. We identified more than twice as many associations for the QRS p-2-p amplitude in lead aVR than in any other lead (N = 47). For the QRS area, we identified most associations with lead II (N = 43); with lead V3 for QRS duration (N = 39); with lead V5 for R duration (N = 33), S duration (N = 33) and S amplitude (N = 31); and with lead V6 for ventricular activation time (N = 28), Q amplitude (N = 24) and Q duration (N = 9). Across all QRS measures, we observed the fewest associations with leads III and aVL.

Fig. 2
Intersection plots of associations for the QRS measures. In each panel, the left bar plot shows how many of the 190 QRS variants associate (P < 5 × 10−8) with each lead. The top bar plot shows the sizes of intersecting sets of associations for the leads denoted with (connected) black dots. a Q amplitude; b Q duration; c Ventricular activation time; d R amplitude; e S amplitude; f R duration; g S duration; h QRS area; i QRS duration; j QRS peak-to-peak

The genes harbouring the missense and loss-of-function variant associations likely encode proteins that play roles in the biology of the QRS complex. Variants annotated as coding/splice represent 47 of the 190 QRS associations, of which 28 are at unreported QRS loci—14 rare and 7 low frequency (Supplementary Data 1 and 2). Loss-of-function variants in HMCN2 and SH3BGR associate most significantly with the QRS p-2-p amplitude, in MYBPC3 and TTN with R wave duration and SCN5A with QRS duration. MYBPC3, TTN and SCN5A are established cardiac disease genes, but SH3BGR and HMCN2 are not. Of the missense variants, 20 associate most significantly with the R amplitude (in ADAMTS7, ALPK3, BAG3, CASP7, CCDC141, CILP, DERL3, FHOD3, FLNC, MYH6, MYH7, MYH7B, MYO18B, NACA, PLEC, RYR2, STAB1 and TTN), 4 with the QRS area (in ALDH1A2, RBM20 and TTN) and 4 with the QRS p-2-p amplitude (in CAND2, HMCN2, SENP2 and STON1-GTF2A1L). Twelve variants associate most significantly with other QRS parameters (in ADAMTS6, ADPRHL1, C17orf58, CCDC141, CFAP46, KLHL38, LAMA3, MAPT, PLCE1, SCN10A, SYNPO2L and TTN).

We tested the 190 QRS variants for association with the ventricular conduction disorders, left anterior/posterior fascicular block (LAFB/LPFB) and left/right bundle branch block (LBBB/RBBB). Using a 5% false discovery rate (FDR, P ≤ 0.003), we observed 46 associations with 37 variants (Supplementary Data 4). Three variants associate genome-wide significantly with LAFB, including the missense variant p.Leu294Arg (MAF = 4.4%) in ADPRHL1 (OR = 1.37, P = 4.8 × 10−8), coding for a cardiac-restricted protein essential for heart chamber outgrowth18. The stop-gain variant p.Ser699Ter in HMCN2 (MAF = 2.0%) associates with a high risk of LPFB (OR = 2.31, P = 5.2 × 10−4).
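A false discovery rate threshold of this kind can be derived from the observed P values. The sketch below assumes the Benjamini-Hochberg step-up rule, which the text does not name explicitly, and uses made-up P values for illustration.

```python
def bh_threshold(p_values, fdr=0.05):
    """Largest P value rejected by the Benjamini-Hochberg rule, or None.

    With m tests sorted in ascending order, the rule rejects all tests up to
    the largest rank k for which p_(k) <= fdr * k / m.
    """
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff


# Made-up P values, not results from the study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_threshold(example_p))
```

Any association with a P value at or below the returned cutoff would be declared significant at the chosen FDR.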

To assess the genetic relationship between ventricular depolarisation and other electrical functions of the cardiac conduction system, we tested the genetic correlation between different ECG measures. The genetic correlation (Methods) between mean QRS duration and mean PR interval duration is 0.14 (95% CI: 0.06–0.21) and between mean QRS duration and mean QT interval duration is 0.31 (95% CI: 0.23–0.40). We also tested the 190 QRS variants for association with ECG measures reflecting atrial and atrioventricular conduction (P–Q component of the ECG), and ventricular repolarisation (ST–T component) (Supplementary Fig. 3, Supplementary Data 5). Using a 5% FDR (P ≤ 0.016), 177 of the variants associate with a non-QRS measure of the ECG. Of those, 16 have more significant effects on non-QRS components: the PR interval (at CAV1, OBFC1, SCN5A, SCN10A and TBX3), T amplitude (at KLF12, LMF1, MYBPC3 and RNF207), P amplitude (at SYNPO2L and MYH6) and ST duration (at MYH7B and RNF207). The pattern of magnitude and direction of effects associated with different ECG parameters is highly variable between variants.

Replication

We assessed the associations of unreported QRS variants in (1) QRS parameters similar to ours from 19,885 ECGs of participants in the UK Biobank and (2) a published GWAS14 on four QRS measures from up to 73,518 participants.

Using data from the published GWAS on QRS duration and three voltage criteria14, we tested variants that associated genome-wide significantly in the Icelandic material with any of these four parameters. We tested seven variants and all replicated (P < 0.05 and the same direction of effects, Supplementary Data 6). In the smaller UK Biobank data, we tested variants that we expected to replicate (over 99% power). Of 13 variants tested, all had the same direction of effects in both datasets and 11 replicated (P < 0.05 and the same direction of effects, Supplementary Data 7).

Associations with echocardiographic traits

Using echocardiograms of 21,275 Icelanders, we tested the QRS variants for association with seven echocardiographic left ventricular measurements (Supplementary Table 5, Supplementary Data 8). We observed 23 associations with 18 variants (Table 1, 5% FDR, P ≤ 7.1 × 10−4), most with the left ventricular end-diastolic diameter (LVEDD). The splice-acceptor site mutation c.927-2A>G in MYBPC3, known to cause familial hypertrophic cardiomyopathy19, associates with interventricular septum thickness (β = 1.2 s.d., P = 1.5 × 10−37), left ventricular posterior wall thickness and LVEDD. The missense variant p.Cys151Arg in BAG3, known to associate with heart failure due to dilated cardiomyopathy20, associates with LVEDD (β = −0.10 s.d., P = 1.8 × 10−13). We discovered ten unreported associations, including an association with rs4794562[T] intronic to HLF, encoding a circadian PAR bZip transcription factor21, and LVEDD (β = −0.06 s.d., P = 5.3 × 10−6).

Table 1 Associations between the QRS variants and echocardiographic traits

Associations with cardiovascular diseases

The 190 QRS variants are a priori more likely to associate with cardiovascular diseases than random sequence variants, and thus we tested them for association with 22 cardiovascular diseases in Icelandic datasets, including cardiac arrhythmias, cardiomyopathy and coronary artery disease (CAD). For 18 of the diseases, we also had data from the UK Biobank22 and performed a meta-analysis (Supplementary Table 6, Supplementary Data 9).
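The meta-analysis combines per-study effect estimates with a fixed-effect inverse-variance method (Methods). A minimal sketch of that weighting is shown below, assuming effects are on the log odds ratio scale. The input numbers are hypothetical, and the two-sided P value here is a Wald-type approximation, whereas the study reports likelihood-ratio P values.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(estimates, standard_errors):
    """Combine effect estimates with inverse-variance weights.

    Returns the pooled estimate, its standard error and a two-sided
    Wald-type P value.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled, pooled_se, p_value


# Hypothetical log odds ratios and standard errors from two cohorts.
log_or = [math.log(0.80), math.log(0.85)]
se = [0.05, 0.10]
pooled, pooled_se, p_value = fixed_effect_meta(log_or, se)
print(f"OR = {math.exp(pooled):.2f}, P = {p_value:.1e}")
```

The cohort with the smaller standard error dominates the pooled estimate, which is the intended behaviour when one dataset is much larger than the other.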

Two rare QRS variants associate with several cardiovascular dysfunctions. The variant c.927-2A>G in MYBPC3 (MAF = 0.18%) associates with cardiomyopathy, heart failure, ventricular tachycardia and atrial fibrillation (AF), and p.Arg721Trp (MAF = 0.34%) in MYH6 with sick sinus syndrome (SSS), AF and coarctation of the aorta. We have previously reported the effects of these variants on risk and progression of heart diseases, as well as two other QRS variants at PLEC and PALMD19,23,24,25,26. Excluding these four variants, we observed 91 associations with 57 of the remaining QRS variants (5% FDR, P ≤ 1.0 × 10−3). Most of the associations are with AF (N = 30), essential hypertension (N = 11), CAD (N = 10) and myocardial infarction (N = 10), of which many have been reported27,28,29,30,31,32,33,34. We also identified reported associations with dilated cardiomyopathy and heart failure20, SSS and presence of pacemaker10.

We found 41 unreported associations with 32 variants and 16 cardiovascular traits (Table 2). Notably, the intronic variant rs116904997[A] in PXN, which associates with the R amplitude in lead V1, associates with reduced risk of AF (MAF = 1.3%, OR = 0.82, P = 4.0 × 10−8). Paxillin, encoded by PXN, is expressed at focal adhesions of non-striated cells and costameres of striated muscle cells35. The variant rs11166990[A] upstream of PTK2 (coding for focal adhesion kinase), which also associates with the R amplitude in lead V1, associates with hypertension (MAF = 48%, OR = 0.97, P = 1.1 × 10−6) and aortic valve stenosis (OR = 0.92, P = 2.1 × 10−4), and is known to associate with reduced risk of AF. The rare missense variant p.Ala2397Val in FLNC (MAF = 0.13%), which associates with a larger R amplitude in lead V1, associates with a reduced risk of myocardial infarction (OR = 0.42, P = 1.3 × 10−4) and a later onset (β = 0.78 s.d., P = 1.1 × 10−4), with greater protection against myocardial infarction in individuals younger than 76 years (OR = 0.21, P = 3.8 × 10−6). Variants in FLNC are known to cause cardiomyopathy, and we recently reported the frameshift variant p.Phe1626Serfs*40 in FLNC36 with opposite direction of effects on the R amplitude in lead V1 and myocardial infarction to those of p.Ala2397Val (Supplementary Table 7).

Table 2 Unreported associations between the QRS variants and cardiovascular diseases

We found four unreported associations with supraventricular tachycardia (SVT), represented by rs75013985[G] intronic to KCND3 (MAF = 2.2%, OR = 1.61, P = 2.7 × 10−9); rs629234[T] upstream of PRRX1 (MAF = 47.9%, OR = 1.11, P = 8.6 × 10−6), which also associates with AF and CAD; rs12144451[C] in CASQ2 (MAF = 43.1%, OR = 0.92, P = 4.4 × 10−4), which also associates with AF; and the missense variant p.Arg935Trp in CCDC141 (MAF = 11.7%, OR = 1.14, P = 7.9 × 10−4). We found three associations with complete AV block (CAVB), represented by p.Glu382Asp in CCDC141 (MAF = 15.3%, OR = 1.29, P = 3.6 × 10−7), which also associates with pacemaker insertion (OR = 1.12, P = 5.9 × 10−5) and SSS (OR = 1.13, P = 2.1 × 10−4); rs4794562[T] intronic to HLF (MAF = 25.7%, OR = 0.86, P = 7.8 × 10−4), which also associates with heart failure (OR = 0.95, P = 6.7 × 10−5); and rs17226667[A] near CEP85L (MAF = 45.8%, OR = 1.13, P = 8.1 × 10−4). We found three unreported associations with dilated cardiomyopathy, represented by the missense variant p.Val77Ala in CAND2 (MAF = 32.7%, OR = 0.82, P = 3.9 × 10−5), rs35006907[A] near ZNF572 (MAF = 35.8%, OR = 0.82, P = 7.0 × 10−5), and rs10838681[A] upstream of NR1H3 (MAF = 31.8%, OR = 1.19, P = 3.1 × 10−4).

Some of the QRS variants that associate with AF have effects on non-QRS components of the ECG, mainly the P amplitude and the PR interval, measuring atrial and atrioventricular nodal conduction (Supplementary Fig. 3). We tested these QRS variants in a subset of our data containing 236,896 ECGs in sinus rhythm of 57,399 individuals without AF. Most variants still had the strongest effects on the QRS complex and non-significant effects on the P amplitude (5% FDR), indicating that the QRS effect is not a consequence of the arrhythmia (Supplementary Fig. 4).

Parent-of-origin effects

We tested the 190 QRS variants for a difference in effects under maternal and paternal models of transmission (Supplementary Table 8)37. Two variants have more than three times greater effect when inherited paternally than maternally (Table 3, P < 0.05/190 = 2.6 × 10−4). One is rs11761424[A] (MAF = 35%) intronic to DGKB (encoding a diacylglycerol kinase), which associates most significantly with R wave duration in lead V4 (βpat = 0.068 s.d., βmat = 0.018 s.d., Phet = 4.1 × 10−5). Sequence variants near DGKB are known to associate with type 2 diabetes38 and AF27 but do not correlate with rs11761424 (r2 < 0.01). The other is rs116904997[A] (MAF = 1.3%) intronic to PXN (encoding paxillin), which associates most significantly with the R amplitude in lead V1 (βpat = 0.28 s.d., βmat = 0.073 s.d., Phet = 1.6 × 10−4). Paxillin is a cytoskeletal protein involved in actin-membrane attachment at sites of focal adhesion35. In the Genotype-Tissue Expression (GTEx) dataset39, rs116904997[A] associates more significantly than any other variant with the expression of PXN in the left ventricle of the heart (Supplementary Fig. 5, P = 3.0 × 10−9). These loci are not known to be imprinted.

Table 3 QRS variants with parent-of-origin effects

Pathway analysis

We applied the data-driven expression-prioritised integration for complex traits (DEPICT, Methods)40 to search for tissues and cell types in which our variants are more likely to be expressed. Enrichment testing of expression in 209 tissues and cell types pointed out cardiovascular tissue (atria, ventricles and arteries) as well as the myometrium and muscles (1% FDR, Supplementary Data 10). Furthermore, DEPICT identified 495 enriched gene sets out of 14,461 reconstituted gene sets (1% FDR, Supplementary Data 11). After clustering gene sets by similarity, the most significant gene sets related to abnormal myocardium layer morphology, pericardial effusion, abnormal heart morphology and the cell-substrate adherens junction. Many genes near unreported QRS variants were part of gene sets linked to the adherens junction, including VCL, SMTN, PXN, FHL2, SVIL and ANKRD1. We also used DEPICT to point out genes within each associated locus based on functional similarity to genes within other associated loci (5% FDR, Supplementary Data 12).

Of the 190 QRS variants, 20 are intronic or coding in genes with higher expression in heart muscle than other tissue types in the Human Protein Atlas (total 201 genes)41. Variants in some of these genes have been described by OMIM as causing autosomal dominant heart conditions (MYBPC3, MYH6, MYH7, CTNNA3, RBM20, RYR2, SCN5A) or associated in GWAS with heart diseases (CASQ2, MYH7B, MYO18B, SYNPO2L, TBX5, TRIM63)27,42,43, resting heart rate (CCDC141, FHOD3)44 or non-QRS ECG parameters (ALPK3, RNF207)45,46. Other genes have not been implicated in heart diseases (ADPRHL1, KLHL38, SH3BGR). Given both the QRS association and the heart muscle expression, variants in these genes could substantially affect cardiac function.

Using the prior information on expression of genes in the heart, we searched for additional rare (MAF ≤ 1%) coding variants that associate with the QRS complex in these 201 genes. Thirteen additional variants associate with one or more QRS parameters (Supplementary Data 13, P < 0.05/(number of coding variants tested) = 2.4 × 10−5). We found four missense variants at unreported QRS loci: C10orf71, FBXO40, GRM1 and HCN4. We note that HCN4 is a known bradycardia gene47. We identified one frameshift and three missense QRS variants where we had identified non-coding QRS variants previously in our study: in CTNNA3, NKX2-5, RNF207 and TBX5. We recently reported the missense variant p.Phe145Leu in NKX2-5 as causing cardiomyopathy and arrhythmias36.

Discussion

We used whole-genome sequence data to perform a GWAS on the QRS complex of the 12-lead ECG in 81,192 Icelanders and identified 130 QRS loci. We found 190 distinct QRS variants, of which 106 are at 86 loci that have not been reported for QRS before, and assessed their effects on seven echocardiographic traits and 22 cardiovascular diseases.

We demonstrate an advantage of performing an in-depth examination of the QRS complex when studying the genetic influence on ventricular depolarisation, myocardial mass and associated cardiac disease. Our approach yielded 130 QRS loci, compared to 52 loci identified in a recent GWAS of the QRS complex14 of a similar size (up to 73,518 individuals) that assessed only the QRS duration and three voltage criteria, measures commonly used clinically. If we had exclusively tested these same four measures, we would have identified associations at only 54 loci. Thus, additional measures of the QRS complex (the QRS area, ventricular activation time and distinct measures of the Q, R and S waves) yielded 76 additional loci.

Many of the 123 parameters tested are correlated, and most variants associate with more than one. However, some variants associate with only one QRS measure, some with only one lead. The R amplitude is the QRS measure that captures most QRS associations, about half of them. The R wave in lead V1 captures half of the R amplitude associations, and 15 variants associate only with the R amplitude in lead V1. Under normal conditions, the R wave is small in lead V1 but gets progressively larger in the precordial leads, usually reaching maximum amplitude in lead V5. The small R wave in lead V1 represents the initial part of ventricular depolarisation. Some conditions may cause an abnormally large R wave in lead V1, including right bundle branch block, type A Wolff–Parkinson–White syndrome, posterior myocardial infarction, right ventricular hypertrophy and acute right ventricular dilatation48. The QRS area captures the second highest number of associations. Although the QRS area has not been used in a GWAS before, it improves identification of left ventricular hypertrophy over standard voltage criteria49. The genetic influence on various aspects of the cardiac conduction system is both tightly linked and complex, as the majority of QRS variants also associate with ECG measures reflecting atrial and atrioventricular conduction, and ventricular repolarisation, but with a variable magnitude and direction of effects.

Before our study, only two low-frequency (1% < MAF ≤ 5%) and no rare variants had been shown to associate with the QRS complex in a GWAS14. We found 33 unreported large-effect associations with rare or low-frequency coding variants, including 13 associations identified by using a priori expression information. This includes associations with a missense variant in ADPRHL1 that also associates genome-wide significantly with left anterior fascicular block, and a stop-gain variant in HMCN2 that also associates with a high risk of left posterior fascicular block. The rare missense variant p.Ala2397Val in FLNC associates with a larger R amplitude in lead V1 and reduced risk of myocardial infarction, with opposite direction of effects to those of p.Phe1626Serfs*40, a reported frameshift variant that causes cardiomyopathy, thus suggesting an opposite functional effect of p.Ala2397Val36. FLNC encodes filamin C, a large actin-crosslinking protein expressed in striated muscles. Filamin C anchors major protein complexes at the sarcolemma, Z-discs and intercalated discs in cardiomyocytes to the actin cytoskeleton and provides a scaffold for a variety of cytoplasmic signalling proteins50,51.

We identified 91 associations for 57 of the QRS variants with a spectrum of cardiovascular diseases, including arrhythmias and conduction disorders, hypertension, ischemic disease, cardiomyopathies, and valve diseases. We found 41 unreported associations, including four with supraventricular tachycardia (SVT), one of which is with a low-frequency variant located near KCND3, a potassium channel gene that has been implicated in Brugada syndrome52 and AF27, and is the first genome-wide significant SVT association.

We found an association with an intronic variant in PXN, encoding paxillin, which has not been associated with AF before. Paxillin is expressed at focal adhesions of non-striated cells and costameres of striated muscle cells35, and regulates cardiac contractility in the zebrafish53. The PXN variant, as well as a variant intronic to DGKB, associates primarily with the QRS complex when the allele is inherited paternally. These loci are not known to be imprinted, although paxillin has been shown to regulate expression, not imprinting, of the imprinted genes IGF2 and H19 through long-range chromosomal interactions between the IGF2 and H19 promoters and a shared distal enhancer54.

A common variant in HLF associates with a decreased risk of heart failure, a lower risk of CAVB and RBBB, a smaller ventricle and a shorter QRS duration. HLF encodes a member of the proline acidic-rich (PAR) protein family, a subset of the bZIP transcription factors that accumulate with robust circadian rhythms in tissues with high amplitudes of clock gene expression. The knockout of HLF along with two other PAR bZip transcription factors has been shown to lead to cardiac hypertrophy and left ventricular dysfunction in mice55.

In summary, our results demonstrate the advantage of analysing the QRS complex in a detailed manner. The use of whole-genome sequencing facilitates the discovery of associations with protein-coding variants that implicate particular genes in the biology of ventricular depolarisation and cardiac pathology. The findings provide new insights into the complex genetics of cardiac electrophysiology and will help direct future functional studies of cardiovascular diseases.

Methods

Icelandic data

The ECGs were performed between 1998 and 2015 at Landspitali, the National University Hospital (LUH), the sole tertiary care hospital in Iceland. They were done in both inpatient and outpatient settings, using the Philips PageWriter Trim III, Philips PageWriter 200, Philips PageWriter 50 and Philips PageWriter 70 cardiographs, and stored in the Philips TraceMasterVue ECG Management System. We analysed ten measures of the QRS complex from 405,732 ECGs of 81,192 individuals. These ten measures were the Q, R and S wave amplitudes and durations; QRS complex area and duration; ventricular activation time (VAT, time between the onset of the Q wave and the peak of the R wave); and the distance from the peak of the R wave to the nadir of the S wave (QRS peak-to-peak amplitude). We analysed each measure for 12 ECG leads: limb leads (I–III), augmented limb leads (aVR, aVL, aVF) and precordial leads (V1–V6). In addition, we analysed mean QRS duration over 12 leads, the Cornell voltage criterion (for men: (R amplitude in lead aVL + S amplitude in lead V3) × QRS duration; for women: (R amplitude in lead aVL + S amplitude in lead V3 + 0.8 mV) × QRS duration) and the Sokolow-Lyon voltage criterion ((S amplitude in lead V1 + R amplitude in lead V6) × QRS duration). We also used four phenotypes derived from automated ECG diagnoses: left anterior fascicular block (LAFB), left bundle branch block (LBBB), left posterior fascicular block (LPFB) and right bundle branch block (RBBB).
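The two voltage criteria, as defined above, can be expressed as follows. This is a sketch only: the function and argument names are hypothetical, amplitudes are assumed to be in mV with the S amplitude passed as a positive magnitude, and QRS duration is assumed to be in ms.

```python
def cornell_criterion(r_avl_mv, s_v3_mv, qrs_duration_ms, sex):
    """Cornell criterion as defined in the text (voltage sum x QRS duration).

    For women, 0.8 mV is added to the voltage sum before multiplying.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    offset_mv = 0.8 if sex == "female" else 0.0
    return (r_avl_mv + s_v3_mv + offset_mv) * qrs_duration_ms


def sokolow_lyon_criterion(s_v1_mv, r_v6_mv, qrs_duration_ms):
    """Sokolow-Lyon criterion as defined in the text (voltage sum x QRS duration)."""
    return (s_v1_mv + r_v6_mv) * qrs_duration_ms


# Hypothetical measurements for one ECG (values are illustrative).
print(cornell_criterion(0.6, 1.1, 92.0, "female"))
print(sokolow_lyon_criterion(1.0, 1.8, 92.0))
```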

We used seven phenotypes describing measurements from echocardiograms: aortic root diameter (N = 19,513), ejection fraction (N = 17,109), left ventricular end-diastolic diameter (N = 18,487), left ventricular end-systolic diameter (N = 5704), mitral regurgitation (N = 5291), left ventricular posterior wall thickness (N = 18,626) and interventricular septum thickness (N = 18,775). We obtained these measurements from a database of 53,122 echocardiograms from 27,460 individuals, read by cardiologists at LUH between 1994 and 2015.

deCODE genetics has extensive medical information on cardiovascular diseases for use in association analyses. The QRS variants were tested for association with 22 cardiovascular diseases: aortic valve stenosis (N = 2457), atrial fibrillation (N = 14,710), coarctation of the aorta (N = 119), complete atrioventricular block (N = 1008), congenital malformations of cardiac septa (N = 1884), coronary artery disease (N = 38,918), dilated cardiomyopathy (N = 424), heart failure (N = 15,237), hypertension (N = 44,290), hypertrophic cardiomyopathy (N = 372), ischemic stroke (N = 5626), mitral valve disease (N = 940), myocardial infarction (N = 24,691), pacemaker insertion (N = 3578), patent ductus arteriosus (N = 628), perimyocarditis (N = 971), pre-excitation Wolff-Parkinson-White syndrome (N = 275), sick sinus syndrome (N = 3568), sudden cardiac death (N = 3128), supraventricular tachycardia (N = 1461), tetralogy of Fallot (N = 60) and ventricular tachycardia (N = 945). The sample sets were based on discharge diagnoses from LUH from 1987 to 2017. The controls consisted of disease-free individuals (up to 380,000), randomly drawn from the Icelandic genealogical database, and individuals from other genetic studies at deCODE.

The Icelandic Data Protection Authority and the National Bioethics Committee of Iceland (no. VSNb2015030024/03.01 and VSNb2015030022/03.01) approved the study.

Genotyping

In Iceland, 32.5 million sequence variants (MAF > 0.01%) were identified by whole-genome sequencing 15,220 Icelanders using Illumina standard TruSeq methodology to a mean depth of 35× (s.d. 8×) and imputed into 151,677 chip-typed individuals and their first- and second-degree relatives16,56. In the UK Biobank, genotyping was performed using a custom-made Affymetrix chip (UK BiLEVE Axiom)57 in the first 50,000 participants and the Affymetrix UK Biobank Axiom array in the remaining participants58 (95% of the signals were on both chips). The Wellcome Trust Centre for Human Genetics performed the imputation using a combination of 1000 Genomes phase 3 (ref. 59), UK10K60 and HRC61 reference panels, for up to 92.5 million SNPs.

Statistical analysis

We adjusted quantitative traits for sex and age at measurement, transformed them into a standard normal distribution using a rank-based inverse normal transformation, and used a linear mixed model implemented by BOLT-LMM62 to test them for association with genotypes. We used a logistic regression model to test binary traits for association with genotypes. For the deCODE data, we adjusted for sex, county of birth, current age or age at death (first- and second-order terms included), blood sample availability for the individual, and an indicator function for the overlap of the lifetime of the individual with the timespan of phenotype collection. For the UK Biobank data, we used 40 principal components to adjust for population stratification and adjusted for age and sex in the logistic regression model. The UK Biobank study included only white British individuals. In each study, we used LD score regression63 to account for inflation in test statistics due to cryptic relatedness and stratification. To combine the deCODE and UK Biobank results, we used a fixed-effect inverse variance method based on effect estimates and standard errors64. We used a likelihood-ratio test to compute all P values. In conditional analyses, we tested variants within ±1 Mb from the top variant and with correlation r2 < 0.05 with the top variant, and required the variants to be genome-wide significant after correcting for the top variant in a regression model.
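The rank-based inverse normal transformation mentioned above maps each observation to the standard normal quantile of its rank. The sketch below assumes the offset (rank − 0.5)/n and average ranks for ties; the text does not state which offset or tie rule was used, and the prior adjustment for sex and age is not shown.

```python
from statistics import NormalDist


def rank_inverse_normal(values):
    """Map values to standard normal quantiles of their (tie-averaged) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend the block [start, end] over all observations tied in value.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0  # 1-based average rank
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n lies strictly between 0 and 1, as inv_cdf requires.
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Hypothetical QRS durations in ms, including a tie.
durations = [88.0, 102.0, 95.0, 95.0, 140.0]
transformed = rank_inverse_normal(durations)
print([round(z, 3) for z in transformed])
```

The transformation preserves the ordering of observations while removing the influence of outliers such as the 140 ms value, so that effects can be reported in standard deviation units.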

Genetic correlation

We estimated the genetic correlation between pairs of traits using LD score regression (v.1.0.0)65 and pre-computed LD scores for European populations (downloaded from https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2).

DEPICT

We used DEPICT40 to (1) prioritise candidate causal genes at associated loci, (2) highlight enriched pathways, and (3) identify tissues/cell types where genes at associated loci are highly expressed. DEPICT uses gene expression data derived from a panel of 77,840 mRNA expression arrays together with 14,461 existing gene sets based on molecular pathways derived from experimentally verified protein–protein interactions66, genotype–phenotype relationships from the Mouse Genetics Initiative67, Reactome pathways68, KEGG pathways69, and Gene Ontology (GO) terms70. DEPICT reconstitutes these 14,461 gene sets by calculating for each gene the probability of membership in each gene set, based on similarities across the expression data. Using these membership probabilities and a set of trait-associated loci, DEPICT tests whether any of the 14,461 reconstituted gene sets are enriched for genes at the trait-associated loci, and prioritises genes that share predicted functions with genes at other trait-associated loci. Additionally, DEPICT uses 37,427 human mRNA microarrays to search for tissues/cell types in which genes from associated loci are highly expressed (all genes harbouring variants in LD of r2 > 0.5 from the most significant variant). We ran DEPICT using all QRS-associated variants. We also used DEPICT to compute pairwise Pearson correlations between all reconstituted gene sets and clustered them by similarity using the Affinity Propagation method71.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.