Genetic and environmental determinants of diastolic heart function

Diastole is the sequence of physiological events that occur in the heart during ventricular filling and principally depends on myocardial relaxation and chamber stiffness. Abnormal diastolic function is related to many cardiovascular disease processes and is predictive of health outcomes, but its genetic architecture is largely unknown. Here, we use machine learning cardiac motion analysis to measure diastolic functional traits in 39,559 participants of the UK Biobank and perform a genome-wide association study. We identified 9 significant, independent loci near genes that are associated with maintaining sarcomeric function under biomechanical stress and genes implicated in the development of cardiomyopathy. Age, sex and diabetes were independent predictors of diastolic function and we found a causal relationship between genetically-determined ventricular stiffness and incident heart failure. Our results provide insights into the genetic and environmental factors influencing diastolic function that are relevant for identifying causal relationships and potential tractable targets.

and environmental factors influencing diastolic function that are relevant for identifying causal relationships and potential tractable targets.
Diastole is not a passive phase of the cardiac cycle, but is a complex sequence of interrelated physiological processes dependent on myocardial relaxation, stiffness and recoil, that are modulated by loading conditions, heart rate, and contractile function. Diastolic function therefore plays a central role in determining left ventricular filling and stroke volume with dysfunction shown to be a predictor of major adverse cardiovascular events and all cause mortality 1 . Decline in diastolic function is also a hallmark of cardiac ageing which occurs through multiple pro-fibrotic and energetic pathways 2, 3 . While several candidate genes have been implicated in various systolic function phenotypes through genome wide association studies (GWAS) 4,5 , the genetic architecture of diastolic function and causal associations with disease are largely unknown. Efforts to better define the molecular mechanisms of diastolic dysfunction could enable the development of innovative therapies for many cardiovascular disease states.
Pre-clinical models of diastolic dysfunction are associated with alterations in left ventricular stiffness on atomic force microscopy that occur at the level of the cardiomyocyte sarcomere as well as due to extracellular matrix protein expansion 6 . Such tissue level changes can be assessed at macroscopic scale in human populations through analysis of diastolic mechanics. Here we use data from participants in UK Biobank with cardiac magnetic resonance imaging (CMR) 7 , and apply deep learning computer vision techniques for precision motion analysis to derive image-based phenotypes of diastolic function 8,9 . In a GWAS of diastolic traits we identify associated loci that map to genes involved in actin assembly, cardiac myocyte survival, and heart failure phenotypes. We also describe the relationship between diastolic function and cardiovascular risk factors, and identify potential causal relationships with disease through Mendelian randomisation.

Study Overview
We analysed CMR data from 39,559 participants in UK Biobank using machine learning segmentation and motion tracking to measure three validated parameters of diastolic function -radial and longitudinal peak early diastolic strain rate (PDSR rr and PDSR ll ) (Fig. 1), and maximum body surface area-indexed left atrial volume (LAV max i ) 10 Table 1. For the GWAS the population was partitioned into discovery and validation sets by the release of data tranches by UK Biobank. To assess the association between these diastolic function traits and other clinical measurements, we further considered a broad selection of 30 imaging and 110 non-imaging phenotypes that included biophysical data and circulating biomarkers (Supplementary Data 1). Independent GWASs were undertaken for each image-derived phenotype and heritability estimated. We used a phenome-wide association study (PheWAS) to identify multiple phenotypes associated with a polygenic instrumental variable score (PIVS) for diastolic function. Potentially causal associations were examined using 2-sample Mendelian randomisation (MR). The results are reported in accordance with GWAS reporting guidelines and a checklist is provided as Supplementary Information.

Imaging and non-imaging phenotype associations
Strain rates declined with age and were lower in men (P < 10 −16 for both associations) (Fig.  2), but no univariable association was observed between age and LAV max i (Extended Data  Fig 2). Multiple linear regression analysis was used to develop a model for predicting each diastolic trait from demographic, haemodynamic and cardiovascular risk factors (Fig. 3a,  Extended Data Fig 3a). In this multivariable analysis strain rate and left atrial volumes were negatively associated with age, male sex and pulse rate in the full model (P < 10 −16 for all associations). Significant associations were also observed for body surface area (BSA) and systolic blood pressure (SBP). Diabetes also added significantly to the associations with the diastolic function traits in the model (PDSR ll : P = 2.36 × 10 −8 ; PDSR rr : P = 9.98 × 10 −6 ; LAV max i : P = 1.04 × 10 −3 ).
We investigated the association between image-derived measures of atrial, ventricular and aortic function with a broader range of non-imaging phenotypes using regularized regression analysis (Extended Data C-reactive protein (CRP), a circulating biomarker of inflammation, showed a positive relationship with serum triglycerides, but we found no circulating biomarkers independently associated with diastolic function. We found that reduced peak diastolic strain rates were associated with reduced LAV max i . Left atrial function was related to indicators of right ventricular function emphasising their functional interdependence 11 .

Genetic architecture of diastolic function traits
Genome-wide common and rare variant association analyses of diastolic function traits-The SNP-based heritability (i.e. the proportion of variance per trait explained by all considered SNPs) was 12% for PDSR ll , 13% for PDSR rr , and 21% for LAV max i . The observed genetic correlation between the diastolic function traits was 0.22 (SE 0.07) between PDSR ll and LAV max i , 0.12 (SE 0.08) between PDSR rr and LAV max i , and 0.85 (SE 0.04) between PDSR ll and PDSR rr .
In total, we identified 9 independent loci from our GWAS analyses, 5 loci for PDSR rr , 4 for PDSR ll , and 2 for LAV max i (2 loci are shared between PDSR rr and PDSR ll ). Within the discovery set, we identified 5 independent loci (LAV max i , 1; PDSR rr , 3; PDSR ll , 1) reaching genome-wide significance (P = 5 · 10 −8 ; Supplementary Fig 3), which were also significant in the validation dataset also (P < 0.05/5). Considering the full dataset, the number of significant independent loci increased to 9 with 2 additional loci associating with PDSR rr , 1 additional with LAV max i , and 1 additional with PDSR ll (Fig. 4).
Variant annotation-Summary information for the 9 loci identified using the full GWAS dataset and two predicted loss of function variants are presented in Table 1 (further information can be found in Supplementary Material, Supplementary Fig 5  Europe PMC Funders Author Manuscripts gene mapping presented as the "likely gene" given by evidence of a functional effect on a gene (details in Supplementary Material), additional heart-related phenotype associations, or a previously reported mechanism linking the gene to diastolic function. Taking lead variants identified from GWAS and the loss of function analysis were able to highlight several structural genes associated with diastolic function that also have a known role in myocardial contractility (e.g. TTN, PLN, GJA1), and in the functional maintenance and stress-response of the cytoskeleton (e.g. FHOD3, BAG3) 12 . Moreover, we were also able to identify a novel link between the NPR3 locus and left atrial volume. The signal co-localizes with a previously discovered association with blood pressure traits (systolic, diastolic and mean arterial blood pressure). The C-allele of the lead SNP (rs1173727) at this locus increases NPR3 expression, and is associated with increased blood pressure and LAV max i , and an increase in risk of heart failure (Supplementary Material). The NPR3 gene encodes the C-type natriuretic peptide receptor, which has a high drug tractability score (https://platform.opentargets.org/target/ENSG00000113389), making it a potential therapeutic target.

Creation of polygenic instrumental variable scores (PIVS and PheWAS)-
PIVSs for each diastolic function trait consisted of 20 SNPs for PDSR rr , 15 SNPs for PDSR ll , and 8 for LAV max i . The PIVS explained 1.5 % of the variability of PDSR rr , 1.1 % of PDSR ll and 0.2 % of LAV max i . There was good agreement between the distribution of the PIVS in the UK Biobank participants with and without CMR indicating no systematic bias in genetic architecture ( Supplementary Fig 9). The Pearson correlation coefficient for the PIVS for PDSR ll and PDSR rr was 0.35 whereas the correlation coefficient between LAV max i and PDSR ll or PDSR rr , respectively, was much lower (<0.01). PheWAS was undertaken and we considered traits that have been previously associated with cardiac phenotypes in the literature, but in addition included an unbiased selection of phenotypes for exploration. In total, we considered 71 quantitative phenotypes and 63 (binary) disease endpoints (Supplementary Data 1). Out of these, 31 phenotypes were significantly associated (P adj < 0.05) with at least one of the diastolic function PIVSs after leave-one-out cross validation (Fig. 5). Some of the identified PheWAS associations are consistent with the phenotype correlation analysis (e.g. pulse rate and blood pressure). We also confirmed associations between diastolic function and previously reported biomarkers of heart failure (e.g. SHBG 13 and IGF-1 14 ). Furthermore, we identified an association of PDSR rr to heart failure, cardiomyopathy and dilated cardiomyopathy, implicating diastolic function in cardiovascular endpoints.
Mendelian randomization-Diastolic dysfunction is a substrate for the subsequent development of heart failure and, in observational studies, diabetes and hypertension are associated risk factors 15 . Here we used Mendelian Randomization (MR) to identify potential causal relationships between diastolic function as an exposure and two key clinical outcomes  Tables 2 to 4), consistent with findings from pre-clinical models 16 . Diastolic blood pressure was causally associated with PDRS rr , and had a bidirectional association with PDSR ll . Systolic blood pressure was causally associated PDSR ll , but not PDSR rr . In addition, higher total peripheral resistance was strongly associated with higher PDSR ll , PDSR rr and LAV max i , adding to the evidence implicating ventriculo-vascular coupling in the development of the diastolic dysfunction 17 .
We also identified a potential causal relationship between lower PDSR rr (stiffer ventricle) and increased risk of heart failure Supplementary Fig 11, which was further corroborated using GWAS summary results 18 from the HERMES consortium (Supplementary Table 5), a GWAS meta-analysis from 47309 heart failure cases and 930014 controls. The magnitude of the effect observed in the MR analysis is consistent with the observational epidemiological estimate, derived from correlating PDSR rr with incident heart failure (Extended Data Fig 6). We found no causal relationship between longitudinal PDSR ll and heart failure, and neither was one observed in our epidemiological analysis (Extended Data Fig 6).
Diastolic dysfunction is frequently present in diabetic patients 19 , however the effects are mostly mediated by an increased risk of coronary artery disease 18 . We found parameter estimates that support a causal relationship between diabetes as an exposure and diastolic function as an outcome, as well as a potential link with instruments for lipid profiles.
Lastly, we found a causal association between LAV max i and an outcome of atrial fibrillation 20 , but there was no evidence that ventricular stiffness also has a causal association.

Discussion
Diastole is a complex series of molecular, biophysical and electro-mechanical processes that initiate contractile deactivation and promote efficient ventricular filling. Impairment of these coordinated mechanisms may lead to diastolic dysfunction which is associated with the presence of multiple cardiovascular risk factors leading to reduced quality of life and higher mortality 21,22 . Here, we used deep learning cardiac motion analysis to perform the first reported GWAS of diastolic function traits with the aim of determining tractable causative mechanisms. We found that diastolic function was a heritable trait with associations in loci related to myofilament mechanics, protein synthesis during mechanical stress and regulation of cardiac contractility. Furthermore, we find a role for a gene implicated in endotheliumderived signalling in diastolic function that is a potential therapeutic target 23 . Lastly, through Mendelian randomisation we observe a causal relationship between genetically-determined diastolic function and heart failure outcomes.
A decline in diastolic function is a feature of the ageing heart and we found that age was a strong independent predictor of diastolic function, with a greater decrease present in males. Outcome studies have suggested that this is a prognosticallybenign feature of healthy ageing that is not related to adverse effects of cardiac senescence 2,24,25 . Changes in titin protein phosphorylation, myocardial redox state and impairment of nitric oxide signalling have been proposed as potential mechanisms 26 , and clinical studies indicate that age-related myocardial fibrosis, cardiomyocyte hypertrophy, and reduced microvascular density, may be a consequence rather than an initiating cause of diastolic dysfunction 27 . Non-invasive imaging biomarkers of fibrosis have also shown promise in identifying biologically relevant pathways for myocardial fibrosis in adult hearts 28 .
We found that diabetes was causally associated with impaired diastolic function after excluding potentially confounding instruments. In epidemiological analyses this relationship was independent of age, BSA, and systolic blood pressure. Increased myocardial stiffness is recognised as one of the earliest, and potentially reversible, manifestations of myocardial dysfunction in diabetes 29 . Several underlying mechanisms related to insulin resistance have been proposed that include altered cardiac energetics and accumulation of advanced glycation end products that promote ventricular stiffness 30 . We also observed a unidirectional causal relationship between genetically-determined diastolic function and an outcome of heart failure, as well as associations with cardiovascular endpoints and circulating biomarkers of heart failure through PheWAS. Longitudinal cohort studies have suggested that persistence or progression of diastolic dysfunction is a risk factor for subsequent heart failure 15 , and our findings suggest that ventricular stiffness is a substrate for the evolution of mixed aetiology heart failure. We also found a unidirectional causal association between left atrial volume and atrial fibrillation, suggesting that it is atrial remodelling that drives this arrhythmic outcome 31 . Lipid profiles are associated with adverse changes in cardiac structure and systolic function, and our findings extend that causal association to diastolic traits 32 .
Our study provides insights into the biological basis of diastolic function with potential implications for therapy development. We identified common variants within genes implicated in cardiomyopathies (e.g. BAG3, FHOD3, PLN), suggesting sarcomere homeostasis during mechanical stress may affect diastolic function in both health and disease 33 . Phospholamban (PLN) is a key regulator of cardiac diastolic function, which modulates sarcoplasmic reticulum calcium-ATPase activity 34 . Common variants in this gene are also associated with trabeculation which has been implicated in promoting ventricular filling 9 . Speckle-tracking echocardiography of Pln knockout mice reveals alterations in longitudinal strain but not radial strain 35 , which is concordant with our observed associations with diastolic function and may relate to associated changes in ventricular geometry 36 . Although there is a genetic correlation between strain rate vectors the majority of SNPs used as polygenic instruments were independent of each other for these traits.
We also identified a potential therapeutic target through the association of variants at the locus of NPR3 influencing diastolic function and risk of heart failure. Previous studies have highlighted its role in blood pressure control 37 , and in mediating the cardioprotective effects of cardiomyocyte and fibroblast-released CNP 23 .
This analysis has some limitations. UK Biobank is a large-cross sectional study that is subject to selection bias and latent population stratification, however risk factor associations appear to be broadly generalisable 38 . The population is predominantly European and further work is required to explore diastolic traits and outcomes in people of diverse ancestries. Echocardiography has been the cornerstone of assessing diastolic function by characterising features of ventricular relaxation, stiffness and recoil 39 . However, feature-tracking CMR has excellent agreement with speckle-tracking echocardiography 40 and invasive measures of diastolic function 41 . While analysis of myocardial deformation is performed throughout the cardiac cycle the measures of early diastolic strain rate may not capture variation in active relaxation prior to ventricular filling. While the relationship between quantitative and dichotomous outcomes may be non-linear such a relationship has not been observed between other genetically-driven diastolic traits and outcomes 42 .
In conclusion, we found that diastolic function is a heritable trait that is causally upstream of incident heart failure. Associated common variants are related to genes that maintain functional homeostasis under biomechanical stress. We also identify a gene encoding an atrial natriuretic peptide receptor as a potential therapeutic target for modulating aspects of diastolic function.

Methods
All analyses in this study can be found here https://github.com/ImperialCollegeLondon/ diastolic_genetics/ 43 and were conducted with R version > 3.6.0.

Participants
For UK Biobank, approximately 500,000 community-dwelling participants aged 40-69 years were recruited across the United Kingdom between 2006 and 201044. All subjects provided written informed consent for participation in the study, which was also approved by the National Research Ethics Service (11/NW/0382). Our study was conducted under terms of access approval number 28807 and 40616. A range of available data were included in this study comprising genotyping arrays and whole exome sequencing, cardiac imaging, health-related diagnoses, and biological samples.
There are 488,252 genotyped participants of which 200,640 have whole exome sequencing. We partitioned 39,559 participants with both CMR imaging and genotyping array data into two tranches by date of release from UK Biobank providing a discovery dataset of 26,893 participants and a validation dataset of 12,666 participants.

Imaging protocol
A standardised CMR protocol was followed to assess cardiac structure and function using two-dimensional retrospectively-gated cine imaging on a 1.5T magnet (Siemens Healthineers, Erlangen, Germany). A contiguous stack of images in the left ventricular short-axis plane from base to apex was acquired, with long axis cine imaging in the two and four chamber views. Each cine sequence had 50 cardiac phases with an acquired temporal resolution of 31 ms7. Transverse cine imaging was also performed in the ascending and descending thoracic aorta. All imaging phenotypes used for the analysis underwent quality control assessment 8 . Participants also underwent a resting 12 lead electrocardiogram which was automatically analysed using proprietary software (CardioSoft, GE Healthcare).

Cardiac image analysis
Segmentation of the short-axis and long-axis cine images in UK Biobank was made using fully convolutional networks, a type of deep learning neural network, which predict a pixel-wise image segmentation by applying a number of convolutional filters onto each input image for feature extraction and classification 9 . The accuracy of image segmentation on the UK Biobank dataset is equivalent to expert human readers 45 . End-diastolic volume, end-systolic volume, stroke volume, and ejection fraction were determined for both ventricles. Left ventricular myocardial mass was calculated from the myocardial volume assuming a density of 1.05 g.ml -1 . Left atrial volume was calculated from the segmented images using the biplane area-length formula V = The aorta was segmented on the cine images using a spatio-termporal neural network 47 . The maximum and minimum cross-sectional areas were derived from the segmentation and distensibility calculated using estimates of central blood pressure obtained using peripheral pulse-wave analysis (Vicorder, Wuerzburg, Germany) 8 .
Motion tracking was performed on the cine images using non-rigid image registration between successive frames (in GitHub repository ukbb_cardiac) 48,49 . To reduce the accumulation of registration errors motion tracking was performed in both forward and backward directions from the end-diastolic frame and an average displacement field calculated 8 . This motion field was then used to warp the segmentation contours from end-diastole onto successive adjacent frames. Circumferential (E cc ) and radial (E rr ) strains were calculated on the short axis cines by the change in length of respective line segments ( Fig. 1A) as E dir = ΔL dir L dir , where dir represents the direction, L dir the length of a line segment along this direction and ΔL dir its change over time. Motion tracking was also performed on the long-axis four-chamber cines to derive longitudinal (E ll ) strain. Peak strain for each segment and global peak strain were then calculated (Fig. 1B). Strain was measured from slices acquired at basal, mid-ventricular, and apical levels. For comparison between each component absolute strain values are reported. Strain rate was estimated as the first derivative of strain and peak early diastolic strain rate in radial (PDSR rr ) and longitudinal (PDSR ll ) directions was detected using an algorithm to identify local maxima (in GitHub repository peak_detection) (Fig. 1C).

Non-imaging phenotypes
In total we consider 110 non-imaging cardiovascular-related phenotypes in UK Biobank participants for the phenotype regression analysis and the genetic analysis. These phenotypes contain information acquired by touch screen questionnaire, interview, biophysical measurement, hospital episode statistics, primary care data and biochemical analysis of venous blood. Details of how each phenotype was acquired are available on the UK Biobank Showcase (http://biobank.ctsu.ox.ac.uk/crystal/). It should be noted that the biochemical markers used here were acquired at the initial assessment visit that preceded imaging assessment. Also, note that not all phenotypes were used in both the phenotype and the genetic analysis (e.g., due to lack of available data at the imaging visit). We refer to the Supplementary Material both for details on the definition of the considered phenotypes and for information on the inclusion of specific phenotypes for each analysis.

Statistical significance testing and multiplicity control
We consider in all phenotype analysis a P-value < 0.05 as significant. Where not stated otherwise, we control the false discovery rate with the Benjamini-Hochberg adjustment. Significance thresholds and decision criteria for GWAS significant loci and causality assessment (Mendelian Randomization) are described in the respective sections and/or in the Supplementary Material.

Phenotype Association Analysis
Continuous variables are expressed as mean ± standard deviation (SD). Differences in continuous variables between groups were performed using Student's t-test. Univariable and multiple linear regression analysis was used to explore the phenotype relationship between each diastolic parameter and cardiovascular risk factors. To identify relationships between diastolic function and a broader range of imaging and non-imaging phenotypes, including circulating biomarkers, we used the least absolute shrinkage and selection operator (LASSO) with stability selection, to optimise the model coefficients. We then ran regression diagnostics on the model with the selected variables, to exclude a possible collinearity inappropriately influencing our model (see Supplementary Material for details on the phenotype analysis and LASSO analysis procedure).

Genotyping and sample QC
Genotyping of UK Biobank participants has been described elsewhere in detail 50

GWAS analysis
For the genetic analysis, there were 34,242 participants of European ancestry (see Supplementary Material for criteria) providing a discovery dataset of 23,321 participants and a validation set of 10,924 participants. GWAS analyses for the three diastolic function traits and additional quantitative traits of interest (as described for the causality assessment) were performed with BOLT-LMM (version 2.3.2) which accounts for ancestral heterogeneity, unknown population structure, and sample relatedness 53,54 . GWAS analyses were adjusted for imaging traits for the first ten genetic principal components, sex, age at time of MRI, the genotyping array and the MRI assessment center and for non-imaging quantitative traits for the first ten principal components, sex, age at measurement of the trait and the genotyping array. GWAS analyses for clinical endpoints of interest (binary endpoints) were conducted with PLINK2 and adjusted for the first ten principal components, sex, age at baseline and the genotyping array. Post-GWAS filtering removed any SNPs with a Hardy-Weinberg equilibrium p-value < 0.05 and MAF < 0.005.

Assessment of shared genetic architecture
For the assessment of shared genetic architecture between diastolic function traits, LD score regression (LDSC (LD SCore) v1.0.1, PMID 25642630) was used to obtain a genetic correlation score between each pair of traits.

Variant annotations
Lead variants for each locus were assigned causal genes, where possible, using a combination of variant annotations and additional functional genomic data sources (colocalisation). Each lead variant was systematically tested for any evidence of functional consequence using VEP. In addition, QTL evidence was extensively searched using Open Targets Genetics 55 . Where eQTL data was available for the locus, the full summary statistics were downloaded to assess colocalisation (see Suppplementary Material).
Variant Effect Predictor (VEP) 56 and Loss-of-Function Transcript Effect Estimator (LOFTEE) 57 plugin were applied on all genomic variants of WES data. In the present study, we considered the genomic variants predicted by LOFTEE with high-confidence label "HC", non-dubious (no "LoF flag" such as variants that located in poorly conserved exons, or splice variants that affect NAGNAG sites or non-canonical splice regions), and minor allele frequency < 0.05, as a Loss-of-function (LoF) mutation.

LoF association analysis
A Loss-of-Function (LoF) carrier indicator was created for each WES sample and each of the human protein-coding genes based on the collapsed information of LoF annotations. A subject was considered as an LoF carrier of the gene if there was at least one LoF mutation (based on methods in the variant annotation section), and a non-carrier if there was none. We then conducted the association test between LoF carrier indicator and the three diastolic function imaging phenotypes. Linear regression was performed with the adjustment of sex, age at time of MRI, and the top ten genetic principal components. The association results were further filtered as those with at least two carriers and the endpoint available. The association was considered significant after multiple testing correction at α = 0.05 (FDR,

Europe PMC Funders Author Manuscripts
Europe PMC Funders Author Manuscripts calculated for three diastolic function traits). We identified 18,660 participants with both whole exome sequencing data and CMR imaging data.

Polygenic instrumental variable scores (PIVS)
Candidate variants for PIVS for the three diastolic function traits (LAV max i , PDSR ll , PDSR rr ) were obtained based on the respective GWAS (full imaging cohort) results by performing clumping (PLINK 1.9) using an LD threshold of R 2 = 0.1 (in a window of 1000kb) and considering all SNPs with P < 10 −6 . Unlike more traditional polygenic risk scores we do not use thousands of variants as instruments but aim to identify a set of instrumental variables that are minimally correlated. This comes with the price of a relatively small set of instruments that explains less variability of a trait, but can be used as proper instruments for the Mendelian randomization analysis. Candidate variants were included in multivariate linear modelling evaluated on the European subset of the full imaging cohort with the first ten genetic principal components, age at MRI, sex, genotyping array and the MRI center as additional covariates and the respective diastolic function trait as dependent variables. The diastolic function traits were scaled to standard deviation (sd) 1 prior to the model estimation -therefore, a unit change in the PIVS score represents a change of 1 sd unit in the respective diastolic function trait. PIVS estimates per individual were then calculated by multiplying the observed genotype with the estimated beta from the multivariate linear model for each SNP and summing these values up. Missing genotypes were imputed using a mean imputation. The variance explained for the PIVS is measured by R 2 , estimated in a linear regression with the PIVS as only variable and the respective diastolic function trait as endpoint.
Next, we conducted a PheWAS using the obtained PIVS (see above and Supplementary Material for a full definition of included phenotypes in the PheWAS). Evaluation of the PIVS were performed in the European non-imaging cohort, i.e. an independent set of subjects compared to the PIVS construction set. Only results are shown that are significant after multiple testing correction at α = 0.05 (FDR, calculated per diastolic function trait) and, as a sensitivity analysis, for which all leave-one-SNP out cross validations analysis lead to a significant result at α = 0.05 after multiple testing correction (FDR) for the number of considered phenotypes. The latter condition is supposed to exclude spurious results which are only driven by one single variant. Leave-one-SNP out cross-validation is performed by excluding one SNP from the list of candidate variants, then re-estimating the PIVS and performing the PheWAS as described above. For the leave-one-SNP out cross-validation, FDR adjustment is performed per combination of diastolic trait and phenotype, considering the number of included SNPs.

Mendelian randomization
For exploring the causes and consequences of diastolic function parameters, we used a bidirectional Mendelian randomization (MR) approach, i.e. two MR analysis are performed: first, an MR analysis using the first chosen trait as exposure is conduced and secondly a MR analysis using the selected second trait is run. By considering both results, evidence can be gathered for a one-directional causal relationship, a bi-directional causal relationship or no causal relationship at all. We performed this analysis taking into account one diastolic and one non-diastolic function trait and for that, we selected non-diastolic function traits of interest by taking into account the results from the observational correlation analysis and clinical expertise. This approach lead to the consideration of six dichotomous risk factors associated with diastolic dysfunction, Arteriosclerosis, Atrial Fibrillation, Heart failure, Hypertension and Diabetes -considering Type I and Type II separately. Further, we considered four physiological variables as potential causes or consequences of changes in diastolic function, as well as five quantitative lipid traits as surrogate for arteriosclerotic risks as potential confounder source for changes in diastolic function. In total we analysed 15 non-diastolic phenotypes and the 3 diastolic phenotypes in our MR.
We established a workflow for the MR analysis which briefly described in this section. Full details are provided in the Supplementary Material. Genetic instrumental variables were selected from the UK Biobank GWAS results generated -as described above-via clumping with PLINK1.9 as described for the PIVS approach. The candidate SNP set prior to clumping was restricted to the intersection between the SNP sets of the pair of GWAS results (hypothesised causal trait GWAS and hypothesised consequence trait GWAS). A full list of the instrumental variables is contained in supplementary table file SupplementaryTable_InstrumentalVariantsMR.xlsx.
We aimed to remove potential confounding instruments by two filtering steps. First, we ran phenotype association analysis to identify and remove instruments that associate significantly with any of the traits Arteriosclerosis, Triglycerides, Apoliprotein B and LDL-Cholesterol. Second, we ran Steiger Filtering to remove instruments with potentially wrongly inferred causal directions.
All MR analysis are based on the point estimates and standard deviations obtained from the respective GWAS. We follow a similar approach to van Oort et al. 58 by using inversevariance weighted method (IVW) as the main analysis and applying several other MR methods for ensuring robustness of the obtained results as sensitivity analyses. We used weighted median-based methods, MR-PRESSO and MR-Egger. Consistent effect estimates across the different methods improves our confidence in a truly causal effect. We consider an association as "potential causal" if the main analysis indicates a causal relationship (P < 0.01), at least two of the sensitivity analyses indicate at least a suggestive causal relationship (P < 0.05) and none of the sensitivity analyses indicate associations with inconsistent effect directionality, i.e. none of the methods showed a suggestive association with conflicting directionality (P < 0.05). No explicit multiplicity adjustment is performed for MR experiments. For "potential causal" associations, we next conducted a supplementary sensitivity analysis using published GWAS results as described in the Supplementary Material -if published GWAS data was available.
All analysis, which involved diastolic and non-diastolic function traits, were conducted in a two-sample approach, i.e., the diastolic function trait GWAS was calculated in the full imaging cohort and the non-diastolic function trait GWAS was calculated in the non-imaging cohort.
For comparison of the effect estimates from the MR-analysis to the observed correlation of diastolic function measurement and disease status, we restricted the analysis population to subjects which were disease-free at the CMR visit. We then fitted a logistic regression model by coding subjects who experienced a first event of the selected disease during follow-up time as 1 and event-free subjects during follow-up as 0. As covariates, we included age at CMR visit, gender, diabetes status, diastolic blood pressure and BMI. Note that this analysis was only performed for relationships judged as potential causal and that involves a disease endpoint (and not a quantitative measurement like pulse rate).

NPR3 Pathway Analysis
In order to increase our understanding of the association of NPR3 with LAV max i and to further characterize the role of natriuretic peptides, we looked for additional genetic associations within genes of the natriuretic peptide pathway (so in addition to NPR3 -NPR1, NPR2, NPPA, NPPB, and NPPC). We conducted GWAS using BOLT-LMM for all imaging traits listed in Extended Data Table 1 as described above, as well as any non-imaging traits associated with rs1173727 (the lead variant for NPR3) across the 4 loci (NPPA and NPPB share the same locus). The GWAS summary statistics were filtered to a 1MB window around each gene (for NPPA/B, the gene used for centering was NPPA). Across these summary statistics, we performed clumping with a p-value threshold of 10 −5 and R 2 < 0.1.
For the identified tag SNPs and associated variants in LD from the clumping analysis, we then tested which of these variants we could confidently link to the natriuretic gene in the locus. If any variant was classified as missense, we selected that variant directly. For eQTL variants, we used colocalisation analysis to link these SNPS to the natriuretic genes in each locus. Relevant eQTL and pQTL data was used (eQTL summary statistics were taken from eQTL Catalog 59 and pQTL data from Sun et al. 60    Scatterplots of a) longitudinal peak diastolic strain rate (PDSR ll ) (n = 38,923) and b) radial peak diastolic strain rate (PDSR rr ) with age (n = 38,700); with density contours, linear model fit and marginal density plots. Violin plots of c) longitudinal (n = 38,923) and d) radial (n = 38,700) peak diastolic strain rate with sex; ****P<10 −16 (Wilcoxon signed-rank test). Boxplots show the median, hinges indicate interquartile ranges (IQR), and whiskers 1.5 × IQR.  a) Multiple linear regression analysis of left ventricular longitudinal peak diastolic strain rate (PDSR ll ), radial peak diastolic strain rate (PDSR rr ) and indexed left atrial maximum volume (LAV max i ) with age, sex body surface area (BSA) systolic blood pressure (SBP), pulse rate and diabetes as predictors. All associations were significant after false discovery rate correction. Data are presented as beta coefficient point estimates (95 % confidence intervals). b) Circular plot visualisation of the associations between the imaging (red -PDSR ll , PDSR rr , global systolic radial strain (E rr ), global systolic longitudinal strain (E ll ), ascending aortic (AAo) distensibility, descending aortic (DAo) distensibility, indexed left ventricular stroke volume (LVSV i ), left ventricular cardiac index (LVCI), LAV max i , indexed right ventricular stroke volume (RVSV i ), and right atrial ejection fraction (RAEF) and the non-imaging phenotypes (green for environmental; blue for biochemical). The strength of the connection between each pair is presented as a ribbon, whose size is proportional to their regression coefficient. All associations with a regression coefficient <0.3 are shown in faint colours (apart from the associations between PDSR ll , PDSR rr and LAV max i and all other phenotypes  Significant loci are labeled by their likely causal gene and lead SNP, see Table 1.    Table 1 Genome-wide Association Results. Summary information on the lead variants identified from each GWAS analysis and the significant genes from the Loss of Function analysis. For each significant locus across the 3 diastolic phenotypes, variant information, GWAS summary statistics and variant to gene annotation is provided. The evidence column is split by: MS -missense variant; eQTL -colocalisation between the GWAS signal and an eQTL for the gene in a plausible tissue type (see Supplementary Material); M -plausible mechanistic link between the gene and the measured heart phenotypes i.e. the gene function suggests a link to diastolic function; Overall -the confidence of variant to gene mapping given all the available evidence. Loci highlighted in grey are those that reached genome-wide significance in the discovery, validation, and full datasets, loci in white reach suggestive significance in the discovery dataset and genome-wide significance in the full dataset.