Introduction

Atrial fibrillation (AF) is the most common cardiac arrhythmia worldwide, and many patients with AF develop an enlarged left atrium (LA)1,2. LA enlargement is associated with poorer AF ablation outcomes and more AF recurrences3, and may also increase the risk of stroke, adverse cardiovascular outcomes and death4,5.

LA enlargement is hypothesized to be a result of atrial remodeling, a persistent change in atrial structure or function6,7. However, pressure and/or volume overload, commonly seen in conditions such as hypertension, structural heart disease, mitral valve disease and heart failure, may also induce changes in atrial structure or function. Atrial remodeling might in turn trigger AF episodes, starting a vicious circle in which AF episodes trigger further atrial remodeling8,9. The degree of atrial remodeling can be assessed through measurement of LA volume with cardiovascular magnetic resonance (CMR) imaging10.

The co-existence of risk factors for AF and for altered LA size and function makes it difficult to determine causality. It remains uncertain whether atrial remodeling is the cause or the consequence of AF. The hypothesis of a causal link between AF and LA volume may be tested by applying a Mendelian randomization (MR) approach. Since AF-associated genetic variants are randomly assigned at birth, a “naturally” randomized controlled trial can be performed, assuming that (1) the genetic variants are reliably associated with AF, (2) the genetic variants are independent of confounding factors and (3) the genetic variants are associated with LA volume only through AF11. In the present study, we use a two-sample MR approach to study the potential causal association between AF and LA size and function in the UK Biobank.

Results

In the current study, 4274 individuals from the general population were included (mean age 62.0 ± 7.5 years, 53.2% women). The mean body mass index (BMI) was 26.6 (SD 4.4) kg/m², and the prevalence of hypertension and diabetes mellitus type II was 32.1% and 3.5%, respectively. Body surface area (BSA) indexed maximum LA volume (LA max), minimum LA volume (LA min) and LA stroke volume were 35.9 ± 10 ml/m², 14.1 ml/m² [interquartile range (IQR) 10.9–17.9], and 21.1 ± 5.6 ml/m², respectively. LA ejection fraction (LA EF) was on average 59.4 ± 8.3%. A total of 36 individuals (0.8%) developed AF during the median follow-up of 2.0 years (IQR 1.8–2.4). Additional information on the cohort is provided in Table 1. A total of 24 genetic variants were removed from the MR analyses to reduce the risk of weak instrument bias (F-statistic < 10) and 2 genetic variants were excluded during data harmonization. A total of 84 genetic variants were taken forward for further analyses. The total number of genetic variants varies per outcome due to MR-Steiger filtering. Data supporting the selection of genetic variants (F-statistics, data harmonization, Steiger filtering) and single genetic variant estimates for all outcomes can be found in Supplementary Table 1.

Table 1 Baseline characteristics.

Results of the MR analyses between AF and indexed LA volumes and ejection fraction are shown in Fig. 1 and Supplementary Table 2. Additional information on the MR analyses of the unadjusted LA volumes can be found in Supplementary Table 2. Sensitivity analyses were performed to test whether the assumptions of the MR analyses were fulfilled (Supplementary Table 3). The MR-Steiger directionality test indicated that the 84 genetic variants known to be associated with AF explained ~7% of the variance in AF. The genetic variants explained more of the variance in AF than in indexed LA max volume (1.7%), indexed LA min volume (1.7%), indexed LA stroke volume (1.8%) and LA ejection fraction (2.0%) (Supplementary Table 3).

Figure 1

Summary MR estimates of the causal association between AF and LA size and function. The figure displays the MR estimates of the association between AF and body surface area indexed left atrial maximal volume (LA max), minimal volume (LA min), stroke volume and ejection fraction. Inverse-variance-weighted (random effects) model, MR-Egger, MR pleiotropy residual sum and outlier (MR-PRESSO), weighted median, weighted mode-based estimator and MR-Mix are shown. Outlier-corrected MR-PRESSO estimates are not included, since no genetic variants were removed in the MR-PRESSO analyses. The X-axis shows the beta coefficient with its upper and lower standard error bounds. The main analyses, i.e. the inverse-variance-weighted random effects estimate under the scenario of balanced horizontal pleiotropy or the MR-Egger estimate under the scenario of unbalanced horizontal pleiotropy, are underlined per outcome. We considered a stringent two-sided Bonferroni-corrected P < 0.05/7 outcomes statistically significant for the main analyses. Significant results for the main analysis are annotated with a single asterisk (*). A P-value threshold of P < 0.05 was adopted for the sensitivity MR analyses. Significant sensitivity MR analyses are annotated with a double asterisk (**). SE denotes standard error. The plot was made using the forestplot package (version 1.10.1, https://CRAN.R-project.org/package=forestplot) in R (version 3.6.3)59.

Using the Rücker framework, we found evidence for unbalanced horizontal pleiotropy in the MR estimates of indexed LA max and indexed LA stroke volume, indicated by significant Q–Q′ and MR-Egger intercepts (P < 0.05) (Supplementary Table 3). We therefore took forward the MR-Egger model as the primary MR method to assess the genetic association with indexed LA max and indexed LA stroke volume, whereas we adopted the inverse-variance-weighted random effects (IVW-RE) model for indexed LA min and LA ejection fraction. Using these models, we found evidence for a causal effect of genetic susceptibility to AF on indexed LA max (β = 1.56, SE = 0.53, P = 4.0 × 10⁻³), indexed LA min (β = 0.57, SE = 0.19, P = 2.0 × 10⁻³) and LA ejection fraction (β = −0.89, SE = 0.25, P = 4.1 × 10⁻⁴) (Fig. 1). Weak-instrument bias was indicated within the MR-Egger estimate of AF on indexed LA max (I²GX = 0.94). We did not find evidence for a causal association between genetic susceptibility to AF and indexed LA stroke volume (β = 0.54, SE = 0.29, P = 6.98 × 10⁻²). Scatter and forest plots of the MR analyses between AF and all LA dimensions are provided in Supplementary Figs. 2–8.

Several sensitivity analyses were performed to test whether valid conclusions on causal inference could be made under different assumptions about possible underlying pleiotropy or instrument invalidity. We investigated whether the results were consistent under the scenario in which a relatively large proportion of the genetic instruments is invalid, using the weighted median approach. Using this approach, we found additional evidence for a significant causal estimate between genetic susceptibility to AF and indexed LA max (β = 1.36, SE = 0.47, P = 3.83 × 10⁻³), indexed LA min (β = 0.89, SE = 0.30, P = 2.8 × 10⁻³) and LA EF (β = −1.17, SE = 0.42, P = 5.84 × 10⁻³). We then investigated whether the results were consistent under the scenario in which a small proportion of the genetic variants are outliers, using the MR-Lasso approach. Using this approach, we found the genetic associations between AF and indexed LA min (β = 0.57, SE = 0.19, P = 1.98 × 10⁻³) as well as LA ejection fraction (β = −0.89, SE = 0.25, P = 4.09 × 10⁻⁴) to be robust to this scenario. However, the association between genetic susceptibility to AF and indexed LA max (β = 0.48, SE = 0.30, P = 1.13 × 10⁻¹) was attenuated (Fig. 1).

We examined which genetic variant(s) drove the attenuation of the association between genetic susceptibility to AF and LA size and function by performing leave-one-out analyses. Results of the leave-one-out analyses using an IVW and MR-Egger approach are provided in Supplementary Table 4 and can be visually inspected in Supplementary Figs. 9–15. We observed that the MR-Egger estimate of AF on indexed LA max was attenuated after exclusion of rs67249485 (β = 1.41, SE = 0.82, P = 9.05 × 10⁻²), a genetic variant located on the long arm of chromosome 4 in the proximity of the PITX2 gene. However, the Wald estimate of rs67249485 did show a significant association with indexed LA max (β = 1.38, SE = 0.58, P = 1.65 × 10⁻²). The results are shown in Fig. 1. The leave-one-out analyses also showed an attenuation of the IVW-RE estimates after exclusion of rs67249485 for indexed LA min (β = 0.36, SE = 0.22, P = 1.00 × 10⁻¹) and LA EF (β = −0.73, SE = 0.30, P = 1.52 × 10⁻²). The Wald estimates for the association between rs67249485 and indexed LA min (β = 1.13, SE = 0.36, P = 1.44 × 10⁻³) as well as LA ejection fraction (β = −1.29, SE = 0.47, P = 5.94 × 10⁻³) were significant (Fig. 1).

We performed several quality controls to gain insight into the statistical validity of rs67249485 as the driver of the association between genetic susceptibility to AF and LA dimensions and function. Histograms of the LA dimension distributions per AF-increasing T allele showed no outliers that could drive the current MR estimates (Supplementary Fig. 16). The genetic variant rs67249485 explained more variance for AF (MR-Steiger R² = 1.58%) than for any measure of LA size or function, for which the explained variance reached a maximum of 0.23% for LA min. This indicates that the Wald estimates assessed the true causal direction (Supplementary Table 1).

Lastly, we performed multivariable MR analyses to assess whether the described genetic associations between AF and LA size and function are independent of blood pressure, as blood pressure can affect both AF12 and LA size and function13,14. In brief, all multivariable Mendelian randomization results were similar to the univariable results. For example, the main MR-Egger analysis of AF on indexed LA max (β = 1.56, SE = 0.53, P = 4.0 × 10⁻³) yielded an effect estimate similar to that of the multivariable MR in which we corrected for systolic blood pressure (β = 1.68, SE = 0.53, P = 1.6 × 10⁻³). Please see Supplementary Table 2 for the full results and Supplementary Table 3 for the sensitivity analyses.

The MR analyses for the non-indexed LA volumes are provided in Supplementary Tables 1–4. The results were consistent with the results for the indexed LA volumes. The MR analyses for LA min (indexed and non-indexed) were repeated using genetic variant-outcome effect estimates obtained from their log-transformed equivalents to account for right skewness. Results were comparable to the primary analyses (Supplementary Table 2).

Discussion

Our study provides evidence to support the hypothesis that genetic susceptibility to AF increases indexed LA max and LA min and decreases LA ejection fraction. We pinpoint rs67249485, near the PITX2 gene, as the driver of the association with indexed LA max and LA min and as a strong contributor to the association with LA ejection fraction. However, we did not find evidence for a causal association between AF and LA stroke volume.

Our primary analyses indicate that genetic susceptibility to AF causally increases indexed LA max and LA min. A causal association between AF and LA stroke volume was not established. One potential explanation for this discrepancy is that AF increases indexed LA max and indexed LA min to a similar degree, nullifying the effect on LA stroke volume. Another potential explanation is that a larger passive conduit function of the LA could compensate for a decreased pump function at larger maximal LA volume through the Frank-Starling law15,16. This would result in a similar LA stroke volume and a lower LA ejection fraction15,16. Indeed, we find that genetic susceptibility to AF is associated with a decreased LA ejection fraction.

The described associations between AF and indexed LA max, indexed LA min and LA EF were attenuated after exclusion of rs67249485, located in an intergenic region near the PITX2 gene17. Our results suggest that rs67249485 is the main driver of the genetic association between AF and indexed LA max and LA min, as the main analyses were nullified after exclusion of rs67249485 while the Wald estimates of rs67249485 were significant. We still find a causal estimate between genetic susceptibility to AF and LA EF after exclusion of this variant, which suggests that other genetic variants may also contribute to the genetic association between AF and LA EF. The validity of rs67249485 as an important driver of the association between AF and LA size and function is statistically supported by several sensitivity analyses, which indicate that the large effect of this genetic variant is very unlikely to be caused by measurement error, uneven population distribution or incorrect direction of causality. The biological role of PITX2 in AF development has been extensively studied and many potential mechanisms have been suggested, including deviations in LA myocyte automaticity, impaired response to oxidative stress, inflammation and a role in the embryonic development of the heart18,19,20,21. The PITX2 gene has not only been linked to an increased risk of AF development, but has also been suggested as a determinant of the success of pulmonary vein ablation in preventing AF recurrence22. Our results provide evidence for another possible biological consequence of PITX2, as we show that LA volumes increase and LA ejection fraction decreases with the AF-increasing T allele of rs67249485. However, further experimental validation is needed to investigate the mechanisms underlying the association between rs67249485, PITX2, AF and LA size and function.

One cardiovascular risk factor that could potentially affect our results is hypertension, as blood pressure is known to affect both AF and LA size and function12,13,14. We therefore performed additional multivariable MR analyses and found that the described associations between AF and LA size and function are independent of systolic blood pressure, diastolic blood pressure and pulse pressure23.

Our study has several strengths, including the use of state-of-the-art genetic and CMR data. The MR design is less susceptible to confounding and adds to previous work in the field24. We excluded individuals with known prevalent AF, and the MR was designed to study the effect of increased AF risk on LA dimensions before onset of the disease. Extensive sensitivity analyses were performed to further reduce the risk of pleiotropy and reverse causation and to support our hypothesis.

Some limitations should be noted as well. First, the genetic variants used as proxy for AF explained approximately 7% of AF variance, which is only a proportion of the total genetic variance of 62% that has been suggested in a previous twin study25. We note that we did not include all previously established genetic variants associated with AF, as the UK Biobank was used as discovery cohort in the most recent GWAS of AF17. We therefore took forward the largest set of genetic variants using effect sizes obtained without the UK Biobank to limit overlap of the exposure and outcome cohorts. In addition, a part of the heritability of AF and LA size is still unknown and there remains a gap between SNP-based and classic heritability estimates26. Several reasons for the missing heritability have been hypothesized, including the focus of GWAS on common genetic variants and the inclusion of individuals that are mainly of European descent26. In addition, GWAS assumes an additive model, which overlooks epistatic effects and possible interactions between genetics and the environment26. Further research into the genetics of AF by studying whole exome sequencing data27,28, expanding the reference genome with other ancestries29, and studying gene–gene30,31 and gene–lifestyle interactions32,33 could increase our insight into AF and consequently the certainty of the described genetic association between AF and LA size and function. We did not have data on LA volume at the onset of atrial contraction and were therefore unable to differentiate the effects of AF on the LA conduit and pump function. Pleiotropy cannot be ruled out completely despite rigorous sensitivity analyses. We were unable to perform a bidirectional MR to further disentangle cause and consequence in the association between AF and LA size and function, as the current cohort is too small to identify robustly associated genetic variants.
Lastly, the AF-associated variants were obtained from a multi-ethnic GWAS meta-analysis, while the outcome cohort included individuals that were mainly of European descent. Population stratification could introduce confounding in the MR analyses through hidden population structure if ancestry is correlated with both the phenotypes and genotypes34. However, we believe this to be unlikely given the stringent adjustments for genetic ancestry in the GWAS of AF and in the regression analyses on atrial size and function35.

In conclusion, we provide evidence that a higher genetic susceptibility to AF increases indexed LA max and LA min, while it decreases LA EF. We pinpoint that the genetic variant rs67249485, near the PITX2 gene, drives the association between AF and indexed LA max and LA min and contributes strongly to the genetic association between AF and LA EF. The association between AF and LA EF was robust to multiple sensitivity analyses, which indicates that genetic susceptibility to AF causally decreases LA EF.

Methods

Study population

The UK Biobank is a large, population-based cohort that included 503,325 individuals via general practitioners of the UK National Health Service (NHS) between 2006 and 2010. Informed consent was obtained from all included individuals and the North West Multi-centre Research Ethics Committee approved the study36. The UK Biobank study has been carried out in accordance with relevant guidelines and regulations and has approval from all relevant institutional review boards, including the North West Multi-centre Research Ethics Committee for the UK, the National Information Governance Board for Health and Social Care for England and Wales, and the Community Health Index Advisory Group for Scotland36. Hospital episode statistics were available up to 31-03-2017 for English participants, 29-02-2016 for Welsh participants and 31-10-2016 for Scottish participants. Individuals with contoured CMR data, as previously performed by Petersen et al., were included in the current study37. Individuals were excluded in case of missing information on body surface area or any covariates (please see below), failure of genetic quality control (including heterozygosity, high missingness and a discrepancy between reported and inferred gender), familial relatedness, or a medical history of mitral valve disease, heart failure, valvular surgery, pulmonary hypertension or prevalent AF at the time of CMR. Definitions of prevalent and incident disease are presented in Supplementary Table 1 and a flowchart depicting the study sample selection is shown in Supplementary Fig. 1.

Left atrial size and function

The CMR protocol and image analyses of left atrial dimensions have been described previously10. In brief, all CMR examinations in UK Biobank were performed on a clinical wide bore 1.5 T scanner (MAGNETOM Aera, Syngo Platform VD13A, Siemens Healthcare, Erlangen, Germany) in Cheadle, United Kingdom. The LA dimensions were manually analyzed by two core laboratories based in London and Oxford and the returned volumes were used in the current study37. In each CMR examination, endocardial LA contours were manually traced at end-systole (maximal LA area) and end-diastole (minimal LA area) in the HLA (4-chamber) view and VLA (2-chamber) view. The biplane method was applied to calculate maximal and minimal areas. Maximal LA volume (LA max volume) was defined as the LA volume at the end of left ventricular systole. Minimal LA volume (LA min volume) was defined as the LA volume at the end of left ventricular diastole. LA stroke volume and LA ejection fraction were calculated as follows: LA stroke volume = LA max − LA min and LA ejection fraction = 100 × (LA max − LA min)/LA max.

LA volumes (LA max and LA min and LA stroke volume) were indexed to body surface area (BSA) to account for body size as well as gender differences12. We took forward these seven outcomes to evaluate the association between AF associated genetic variants and LA size and function. As sensitivity analyses, we log-transformed LA min (indexed and non-indexed) to account for right skewness.
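The derived LA measures described above can be sketched in a few lines. This is a minimal illustration of the stated formulas, not the authors' pipeline; the function name, argument names and the example volumes are hypothetical.

```python
def la_function_indices(la_max_ml, la_min_ml, bsa_m2):
    """Derive LA stroke volume, ejection fraction and BSA-indexed volumes.

    Inputs are the maximal and minimal LA volumes in ml and the body surface
    area in m^2. Names are illustrative, not UK Biobank field names.
    """
    if la_max_ml <= 0 or bsa_m2 <= 0:
        raise ValueError("LA max and BSA must be positive")
    stroke_ml = la_max_ml - la_min_ml            # LA stroke volume (ml)
    ef_percent = 100.0 * stroke_ml / la_max_ml   # LA ejection fraction (%)
    return {
        "la_max_indexed": la_max_ml / bsa_m2,    # ml/m^2
        "la_min_indexed": la_min_ml / bsa_m2,    # ml/m^2
        "la_stroke_indexed": stroke_ml / bsa_m2, # ml/m^2
        "la_ef_percent": ef_percent,             # not indexed: already a ratio
    }


# Made-up volumes for one hypothetical participant.
example = la_function_indices(la_max_ml=70.0, la_min_ml=28.0, bsa_m2=2.0)
print(example)
```

LA ejection fraction is left unindexed because it is a ratio of two volumes, which is consistent with the three indexed volumes, three non-indexed volumes and LA ejection fraction making up the seven outcomes.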

Genotype and imputed data

The Wellcome Trust Centre for Human Genetics performed genotyping, quality control before imputation, and imputation to the HRC v1.1 panel in the individuals of the UK Biobank. The quality control of samples and variants and the imputation have been described in detail previously38.

Genetic variants: atrial fibrillation

In this study, 111 genetic variants associated with AF (P-value < 5 × 10⁻⁸) in the prior GWAS of Nielsen et al. were selected as genetic instruments for the current analyses39. The effect sizes of the genetic variants associated with AF within the independent cohorts of the Broad AF Study, BBJ, EGCUT, PHB, SiGN and the Vanderbilt AF Registry published by Roselli et al. were used (number of cases = 32,957, number of controls = 83,546)17. We opted for this approach to obtain one of the largest sets of robust AF genetic instruments, while also being able to use effect sizes that were independent of the UK Biobank to limit overlap of the exposure and outcome cohorts. One genetic variant (rs17005647) was removed a priori, as we were unable to precisely calculate the beta from the provided odds ratio of 1.0.
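To illustrate why an odds ratio reported as 1.0 is problematic, the sketch below converts an odds ratio to a beta on the log-odds scale and computes a single-variant Wald ratio. It assumes the common first-order standard error that ignores uncertainty in the variant-exposure estimate; the function names and all numbers are hypothetical and are not taken from the study data.

```python
import math


def log_odds(odds_ratio):
    """Convert an odds ratio to a beta on the log-odds scale."""
    return math.log(odds_ratio)


def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Single-variant causal estimate: outcome effect per unit of exposure effect.

    The standard error is a first-order approximation (an assumption of this
    sketch) that ignores uncertainty in the variant-exposure estimate.
    """
    if beta_exposure == 0:
        raise ZeroDivisionError("variant-exposure beta is zero; ratio undefined")
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)


# Hypothetical variant: OR 1.25 for AF and 0.30 ml/m^2 per allele on an LA volume.
beta_af = log_odds(1.25)
estimate, se = wald_ratio(beta_outcome=0.30, se_outcome=0.12, beta_exposure=beta_af)
print(round(estimate, 3), round(se, 3))

# An odds ratio rounded to 1.0 maps to a beta of exactly zero,
# so the ratio cannot be computed for such a variant.
beta_rounded = log_odds(1.0)
```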

Genetic variants: left atrial size and function

Effect estimates of the AF associated genetic variants on LA size and function were obtained from all individuals included in the current study. Effect sizes were obtained by performing linear regression analyses on LA size and function, which were corrected for age during the imaging visit, sex, 30 principal components and genotyping array.

Genetic variants: blood pressure traits

Effect estimates of the AF-associated genetic variants on systolic blood pressure, diastolic blood pressure and pulse pressure were obtained from a cohort of 408,212 unrelated individuals from the UK Biobank that were not included in the estimates of LA size and function. Systolic and diastolic blood pressure values were obtained during the baseline visit through two automated and/or two manual blood pressure measurements and the average of all measurements was used. The automated measurements were corrected according to previously described methodology40. Pulse pressure was calculated by subtracting diastolic from systolic blood pressure. The use of blood pressure altering medication was taken into account by adding 15, 10 and 5 mmHg to systolic blood pressure, diastolic blood pressure and pulse pressure, respectively41. Effect sizes were obtained by performing linear regression analyses, which were corrected for age during the baseline visit, sex, 30 principal components and genotyping array.
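The blood pressure phenotype construction can be sketched as follows. This is an illustrative reading of the description above, with hypothetical function names and made-up readings; the correction of automated measurements is not reproduced here.

```python
def mean_reading(readings):
    """Average all available blood pressure readings (mmHg)."""
    return sum(readings) / len(readings)


def adjust_blood_pressure(sbp, dbp, on_bp_medication):
    """Return medication-adjusted systolic, diastolic and pulse pressure (mmHg)."""
    if on_bp_medication:
        sbp += 15.0  # systolic offset stated in the text
        dbp += 10.0  # diastolic offset stated in the text
    # Pulse pressure is systolic minus diastolic, so a treated participant
    # gains 15 - 10 = 5 mmHg, matching the stated pulse pressure offset.
    pulse_pressure = sbp - dbp
    return sbp, dbp, pulse_pressure


# Made-up readings for one hypothetical treated participant.
sbp = mean_reading([138.0, 142.0])
dbp = mean_reading([84.0, 86.0])
adjusted = adjust_blood_pressure(sbp, dbp, on_bp_medication=True)
print(adjusted)
```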

Mendelian randomization analysis

The genetic variants were tested for weak instrument bias (F-statistic) and reverse causation (MR-Steiger). F-statistics were calculated per genetic variant using the following formula: F = R²(n − 2)/(1 − R²). Here, n is the sample size of the exposure and R² is the proportion of variance of the exposure explained by the genetic variant42. An F-statistic < 10 was considered to indicate weak-instrument bias and these genetic variants were removed from further analyses. Reverse causation was assessed through MR-Steiger filtering, and genetic variants with a significantly higher (P < 0.05) R² for the outcome than for the exposure were removed43. The R² for AF (on the liability scale)44 and linear outcomes45 were calculated based on the summary statistics provided in Supplementary Table 1 using previously established formulae.
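The weak-instrument filter can be sketched as below. The F-statistic formula and the threshold of 10 follow the text, and the sample size is the sum of the reported cases and controls of the exposure cohorts; the variant names and R² values are hypothetical, and the MR-Steiger step is not shown.

```python
def f_statistic(r2, n):
    """Per-variant F-statistic: F = R^2 (n - 2) / (1 - R^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return r2 * (n - 2) / (1.0 - r2)


def strong_instruments(variants, n, threshold=10.0):
    """Keep variants whose F-statistic is at least the threshold."""
    return [name for name, r2 in variants if f_statistic(r2, n) >= threshold]


# Exposure sample size: reported cases plus controls.
N_EXPOSURE = 32957 + 83546

# Hypothetical variants with made-up R^2 values for the exposure.
variants = [("variant_a", 0.0020), ("variant_b", 0.00005), ("variant_c", 0.0004)]
kept = strong_instruments(variants, N_EXPOSURE)
print(kept)
```

With a sample of this size, only variants explaining a very small share of exposure variance (here the made-up variant_b) fall below the threshold.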

MR estimates were generated using inverse-variance-weighted random effects meta-analysis. The Rücker framework was applied to assess heterogeneity and thus potential pleiotropy within the MR effect estimates46. Balanced horizontal pleiotropy was assessed by calculating Cochran’s Q (P < 0.05) and the I² index (> 25%) as indicators of heterogeneity within the IVW model47. Potential unbalanced pleiotropy was assessed by performing MR-Egger regression, as MR-Egger allows for a non-zero intercept48. The Rücker framework then assesses the difference between the heterogeneity within the IVW effect estimate (Cochran’s Q) and the heterogeneity within the MR-Egger regression (Rücker’s Q′), called Q–Q′. A significant Q–Q′ (P < 0.05), in combination with a significant non-zero intercept of the MR-Egger regression (P < 0.05), was considered to indicate unbalanced horizontal pleiotropy. Under this scenario, we report the MR-Egger effect estimate, as it provides a causal estimate if the InSIDE (Instrument Strength Independent of Direct Effect) assumption holds48. Weak instrument bias within the MR-Egger regression was assessed by I²GX. An I²GX of > 95% was considered to indicate a low risk of weak instrument bias within the MR-Egger estimates49. The main analysis consisted of either the IVW-RE estimate (under the scenario of balanced horizontal pleiotropy) or the MR-Egger estimate (under the scenario of unbalanced horizontal pleiotropy).
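The model comparison described above can be sketched with summary statistics alone. The code below is a simplified illustration, not the implementation used in the study: it assumes first-order weights (1/SE² of the variant-outcome estimates), a multiplicative random-effects standard error for IVW, and a chi-square reference distribution with one degree of freedom for Q–Q′. The significance test of the MR-Egger intercept and I²GX are omitted, and all input values are made up.

```python
import math


def ivw_and_egger(bx, by, se_y):
    """Return IVW and MR-Egger estimates with Cochran's Q and Ruecker's Q'."""
    # Orient each variant so its exposure effect is positive (needed for MR-Egger).
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    w = [1.0 / s ** 2 for s in se_y]
    k = len(xs)

    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))

    # IVW: weighted regression of outcome effects on exposure effects through the origin.
    beta_ivw = swxy / swxx
    q = sum(wi * (y - beta_ivw * x) ** 2 for wi, x, y in zip(w, xs, ys))
    # Multiplicative random effects: inflate the SE only when Q exceeds its df.
    se_ivw = math.sqrt(1.0 / swxx) * max(1.0, math.sqrt(q / (k - 1)))

    # MR-Egger: the same weighted regression with a free intercept.
    denom = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    q_prime = sum(wi * (y - intercept - slope * x) ** 2 for wi, x, y in zip(w, xs, ys))

    # Q - Q' is compared with a chi-square distribution with 1 df,
    # whose upper tail probability equals erfc(sqrt(x / 2)).
    q_diff = max(0.0, q - q_prime)
    p_q_diff = math.erfc(math.sqrt(q_diff / 2.0))

    return {"beta_ivw": beta_ivw, "se_ivw": se_ivw, "q": q,
            "egger_slope": slope, "egger_intercept": intercept,
            "q_prime": q_prime, "p_q_diff": p_q_diff}


# Made-up summary statistics: a true slope of 0.5 plus a constant pleiotropic shift.
bx = [0.10, 0.20, 0.30, 0.40, 0.50]
se_y = [0.02] * 5
balanced = ivw_and_egger(bx, [0.5 * x for x in bx], se_y)
directional = ivw_and_egger(bx, [0.05 + 0.5 * x for x in bx], se_y)
print(balanced["beta_ivw"], directional["egger_intercept"], directional["p_q_diff"])
```

In the second made-up data set the constant shift biases the IVW slope upward, while MR-Egger absorbs it in the intercept and recovers the slope, which is the situation in which the text reports the MR-Egger estimate.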

Additional sensitivity analyses included the Mendelian randomization-Pleiotropy Residual Sum and Outlier (MR-PRESSO)50, MR-Lasso51, leave-one-out analyses52,53, weighted median54, weighted mode55 and MR-Mix56, multivariable MR-IVW23, multivariable MR-Egger57 and multivariable MR-PRESSO50. These methods all have their own strengths and weaknesses and jointly provide information on the possibility of a true causal relationship. Outlier-robust methods include MR-PRESSO (excludes outliers), leave-one-out analyses (exclude genetic variants one by one and repeat the IVW and MR-Egger analyses) and MR-Lasso (downweights outliers). Weighted median (majority valid), weighted mode and MR-Mix (plurality valid) generally have the potential to estimate true causal effects when larger proportions of genetic variants violate MR assumptions (generally at the cost of power). The multivariable MR-IVW23, multivariable MR-Egger57 and multivariable MR-PRESSO50 analyses were performed to correct for the potential influence of systolic blood pressure, diastolic blood pressure and pulse pressure on the causal association between AF and LA size and function12,13,14. Effect estimates for blood pressure traits were obtained in an independent cohort from the UK Biobank (see: Genetic variants: blood pressure traits). Weak instrument bias within the multivariable MR setting was considered unlikely if Qx1 and Qx2 were larger than the critical value of the χ² distribution with degrees of freedom equal to the number of SNPs minus one, at a P value of 0.05 (ref. 23). Potential pleiotropy within the multivariable MR setting was assessed using Qa, which was considered to indicate potential pleiotropy when larger than the critical value of the χ² distribution with degrees of freedom equal to the number of SNPs minus two, at a P value of 0.05 (ref. 23). A multivariable MR-Egger intercept with a P value < 0.05 was considered evidence of unbalanced horizontal pleiotropy, in which case the multivariable MR-Egger regression was considered to provide the more robust causal estimate57.
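Of the sensitivity analyses listed above, the leave-one-out analysis is the simplest to illustrate. The sketch below drops each variant in turn and recomputes a fixed-effect IVW point estimate; it is a simplified stand-in for the IVW and MR-Egger leave-one-out analyses in the study, and the variant names and summary statistics are hypothetical.

```python
def ivw_estimate(bx, by, se_y):
    """Inverse-variance-weighted slope through the origin (weights 1/se_y^2)."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def leave_one_out(names, bx, by, se_y):
    """Recompute the IVW estimate after excluding each variant in turn."""
    results = {}
    for i, name in enumerate(names):
        keep = [j for j in range(len(names)) if j != i]
        results[name] = ivw_estimate([bx[j] for j in keep],
                                     [by[j] for j in keep],
                                     [se_y[j] for j in keep])
    return results


# Made-up data: three variants with a ratio of 0.5 and one with a ratio of 2.0.
names = ["variant_1", "variant_2", "variant_3", "variant_4"]
bx = [0.10, 0.20, 0.30, 0.40]
by = [0.05, 0.10, 0.15, 0.80]
se_y = [0.02, 0.02, 0.02, 0.02]

full = ivw_estimate(bx, by, se_y)
loo = leave_one_out(names, bx, by, se_y)
# The variant whose removal changes the estimate most is the main driver.
driver = max(loo, key=lambda name: abs(loo[name] - full))
print(full, loo, driver)
```

A single variant whose removal shifts the estimate far more than any other, as variant_4 does in this made-up example, is the pattern the text describes for rs67249485.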

Causal effect estimates are reported as β values, since LA volumes and fractions are continuous variables. The main analyses were considered significant at a Bonferroni-corrected α = 0.05/7 outcomes. For the sensitivity analyses, we adopted α = 0.05 to ascertain statistical significance when replicating the findings of the main analysis. Continuous variables are displayed as mean ± standard deviation when normally distributed and as median and interquartile range when skewed. Categorical variables are displayed as percentages. Regression analyses to obtain genetic variant-outcome associations were performed using the statistical software STATA 15 (StataCorp LP)58. MR analyses were performed using R (version 3.6.3)59, TwoSampleMR (version 0.5.3)60, MR-PRESSO (version 1.0)50, MR-Lasso51, MR-Mix (version 0.1.0)56, MendelianRandomization (version 0.5.0)61 and MVMR (version 0.3)23.
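As a worked example of the significance threshold, the sketch below applies the Bonferroni-corrected α to the main-analysis P values for the indexed outcomes reported in the Results. The variable names are illustrative.

```python
N_OUTCOMES = 7
ALPHA_MAIN = 0.05 / N_OUTCOMES   # Bonferroni-corrected threshold, about 0.0071
ALPHA_SENSITIVITY = 0.05         # threshold adopted for the sensitivity analyses

# Main-analysis P values as reported in the Results section.
main_p_values = {
    "indexed LA max": 4.0e-3,
    "indexed LA min": 2.0e-3,
    "LA ejection fraction": 4.1e-4,
    "indexed LA stroke volume": 6.98e-2,
}

significant = {name: p < ALPHA_MAIN for name, p in main_p_values.items()}
for name, flag in significant.items():
    print(f"{name}: P = {main_p_values[name]:.2e}, significant = {flag}")
```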