Transcriptomics of cardiac biopsies reveals differences in patients with or without diagnostic parameters for heart failure with preserved ejection fraction

Heart failure affects 2–3% of adult Western population. Prevalence of heart failure with preserved left ventricular (LV) ejection fraction (HFpEF) increases. Studies suggest HFpEF patients to have altered myocardial structure and functional changes such as incomplete relaxation and increased cardiac stiffness. We hypothesised that patients undergoing elective coronary bypass surgery (CABG) with HFpEF characteristics would show distinctive gene expression compared to patients with normal LV physiology. Myocardial biopsies for mRNA expression analysis were obtained from sixteen patients with LV ejection fraction ≥45%. Five out of 16 patients (31%) had echocardiographic characteristics and increased NTproBNP levels indicative of HFpEF and this group was used as HFpEF proxy, while 11 patients had Normal LV physiology. Utilising principal component analysis, the gene expression data clustered into two groups, corresponding to HFpEF proxy and Normal physiology, and 743 differentially expressed genes were identified. The associated top biological functions were cardiac muscle contraction, oxidative phosphorylation, cellular remodelling and matrix organisation. Our results also indicate that upstream regulatory events, including inhibition of transcription factors STAT4, SRF and TP53, and activation of transcription repressors HEY2 and KDM5A, could provide explanatory mechanisms to observed gene expression differences and ultimately cardiac dysfunction in the HFpEF proxy group.

with poor prognosis 8 comparable to HFrEF. In contrast to HFrEF, there is currently no available evidence based therapy for HFpEF 9 , which may be explained by different pathophysiology between these HF conditions. It was proposed that co-morbidities such as ischemia, hypertension and diabetes may be the drivers of disease progression in HFpEF 10 through diverse mechanisms more specific for HFpEF than HFrEF involving coronary microvascular inflammation and endothelial dysfunction leading to intra-and extracellular rearrangements. This has been demonstrated in incomplete relaxation of myocardial strips 11 and increased passive cardiac stiffness by titin changes and increased interstitial diffuse fibrosis 10,12 . Patients undergoing coronary bypass grafting (CABG) commonly have disturbances in LV function and thus may serve as models to study HF particularly since myocardial biopsies obtained during surgery provide a unique opportunity to study the gene expression in the different types of heart failure.
At the molecular level, gene expression programs are systematically regulated by transcription factors, chromatin regulators and other factors that are important for the establishment and maintenance of the cell state. Dysregulation of these programs can result in different diseases 13 . Studies of gene expression in heart tissue has a great potential to uncover the underlying molecular mechanisms leading to HF. In this regard, previous few studies have primarily attempted to identify gene expression differences between failing hearts (HFrEF) and normal hearts using RNA-seq analyses with limited number of samples [14][15][16] . There are also reports on RNA-seq expression in a mouse model for cardiac hypertrophy 17 .
In this study, we hypothesised that patients with HFpEF characteristics would show distinctive gene expression compared to patients with Normal physiology. Myocardial biopsies were obtained from patients undergoing elective CABG who all had LVEF ≥ 45% 1 . We performed RNA-seq based transcriptomics analysis in order to identify genes that are dysregulated in HFpEF compared to Normal.

Results
patients. From a total of 16 patients, 5 were prospectively classified as HFpEF proxy and 11 as Normal physiology. The clinical patients' characteristics are summarised in Table 1. The majority were male with a median age of 75 years in the HFpEF proxy group and 65 years in the Normal physiology group (p-value = 0.089). Three patients had a clinical diagnosis of heart failure, all in the HFpEF proxy group (patients 9, 11 and 15, Table 2). A history of hypertension (100% and 82%) and diabetes (60% and 45%) was present in HFpEF proxy and Normal physiology groups, respectively. One patient with Normal physiology had a history of myocardial infarction. Echocardiographic variables revealing HFpEF proxy or Normal physiology are shown in Table 2.
Gene expression profiles of LV tissue in HFpEF proxy and Normal physiology groups. The transcriptome sequencing resulted in an average of 26 million paired-end reads per biopsy sample. Number of mapped reads ranged from 17 to 30 million with an average of 24 million (Fig. 1A) which covered more than 85% of the sequenced reads. Normalisation and batch correction was performed using Trim mean of the M-values (TMM) and ARSyNseq, respectively, after filtering out lowly expressed genes ( Supplementary Fig. S1). The majority (~90%) of the expressed genes were protein coding while less than 5% were detected as long noncoding RNA (lncRNA) or antisense RNA (Fig. 1B, Supplementary Fig. S2).
We first characterised differences between the gene expression profiles of the 16 samples, utilising an unsupervised classification method -Principal Component Analysis (PCA). PCA using batch corrected normalised gene expression revealed two clusters, corresponding to HFpEF proxy and Normal physiology along the first principal component 1 (PC1). The PCA model revealed that the largest variation of the PC1 could explain 22% of the variation while principal component 2 explained 14%. Additionally, any bias between these groups and sequencing batches was investigated and no batch effect contributing to this clustering was found (Fig. 1C).
The Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) model with 7-fold cross-validation depicted in Fig. 1B distinguished HFpEF proxy from Normal physiology using the first predictive component. An S-plot was generated to identify signature genes in both groups (Fig. 1D). In the S-plot, the magnitude of the contribution of each gene to the OPLS-DA model (p [1]) was plotted against significance of the corresponding genes (p(corr) [1]). As can be seen in Fig. 1E, genes (marked in red) contributing most to the discrimination of HFpEF proxy were MYH7, MYBPC3, TCAP, PTGDS, BSG, ALDOA, EEF2, IDH2, CRIP2, EEF1A2, PLN, CYCS, MYOZ2, MDH1 and DCN. Several of these genes were also found among the significantly differentially expressed genes ( Fig. 1F; see below). Genes dysregulated in HFpEF proxy patients. Differentially expressed genes (DEGs) were identified using NOIseqbio with a false discovery rate (FDR) < 0.05. Our analysis identified 743 DEGs discriminating between HFpEF proxy and Normal physiology whereof the majority were down-regulated in HFpEF proxy samples compared with Normal physiology samples ( Fig. 2A). The distribution of fold change as a function of mean expression difference in all the cases revealed that more than 90% of significant DEGs had fold changes ranging from 1.30 to -2.60 ( Fig. 2A,B). For comparison, we also calculated DEGs using FDR < 0.1 (Fig. 2C,D), resulting in 328 up-regulated and 1285 down-regulated genes in HFpEF proxy samples. The functional analysis (cf. below) of these DEGs showed similar results to those using FDR < 0.05, and we therefore kept the more stringent FDR < 0.05 DEGs in the current analysis.
Among the 743 DEGs, 69 genes were transcriptional regulators, of which 67 were down-and 2 up-regulated. A comparison between the predicted DEGs with the genes in the S-plot derived from OPLS-DA showed that most DEGs with few exceptions are located at largest distances from the axes centre. Thus, the dysregulated genes were predicted by two independent methods (Fig. 1F). The complete list of up-and down-regulated genes is given in the Supplementary Table S1.
The top biological processes associated with down-regulated genes in the HFpEF proxy group were cardiac muscle contraction and extracellular matrix assembly/organization (Fig. 3A, Supplementary Table S2). Enriched www.nature.com/scientificreports www.nature.com/scientificreports/ Gene Ontology (GO) terms reflecting molecular function revealed genes for cadherin binding, kinase binding, actin filament and integrin binding (Fig. 3B, Supplementary Table S2).
Upstream regulators and regulatory effect networks. The Ingenuity Pathways Analysis (IPA) analysis suggested that out of the total set of 736 DEGs (7 Ensembl gene IDs were not recognised by IPA and hence excluded from this analysis), 381 DEGs (52%) belong to known upstream regulators predicted to be activated or inhibited (activation Z-score ≥ ±2), at a Fisher's Exact test p-value ≤ 0.05 18 . A comprehensive list of 52 upstream regulators along with predicted activation states is found in Supplementary Table S3. Of these regulators, 34 were predicted to be inhibited and 18 were to be activated (Fig. 4A).
Among the upstream regulators predicted in our analysis to be inhibited in the HFpEF proxy group were transcription factor STAT4, HTT and tumour suppressor TP53 (Fig. 4), the latter regulating a total of 96 DE genes (13% of the DE gene set). Additional predicted inhibited upstream regulators included specific transcription factors (SRF, MYOD1, FOS and HIF1A; Supplementary Fig. S3). Furthermore, histone demethylase KDM5A and the basic helix-loop-helix (bHLH) type transcription factor HEY2 (Fig. 4), both transcriptional repressors involved in Notch signalling, were among the identified activated upstream regulators.
Based on the predicted upstream regulator, a total of 8 regulatory effect networks were identified (Supplementary Table S4). These networks depict potential paths by which activation or inhibition of specific transcription factors lead to impaired cardiac function, heart failure and other heart diseases (networks 1-4,7), as well as impaired vessel formation/endothelial cell function (networks 3,5,6,8). The most causally consistent and densely connected network is shown in Fig. 5.

Discussion
To our knowledge, this is the first study using RNA-seq to identify dysregulated genes in patients with HFpEF characteristics, as schematically summarised in Fig. 6. In this exploratory translational study of elective CABG patients undergoing perioperative myocardial biopsies, we found that patients in the HFpEF proxy group displayed distinctive gene expression compared to patients with Normal physiology. The top biological functions associated with down-regulated genes in HFpEF proxy patients were cardiac muscle contraction, oxidative phosphorylation, endocytosis/cell remodelling, matrix organization and fibrosis. Further, genes regulated by transcription factor STAT4 and tumour suppressor TP53 were found to be down-regulated.
patients. The patients investigated in this study were the initial group in whom the myocardial biopsies were obtained within the ongoing CABG-PREFERS study 1   www.nature.com/scientificreports www.nature.com/scientificreports/  [1], in relationship to its significance, p(corr) [1]. Genes contributing to the highest magnitude of the separation for the respective groups are highlighted in red/orange. Shaded boxes indicate up-regulated genes in the HFpEF proxy group (bottom orange box) and in the Normal physiology group (top green box). (F) The same plot as in "E" but genes overlapping with differentially expressed genes are highlighted with in red/orange. (2019) 9:3179 | https://doi.org/10.1038/s41598-019-39445-2 www.nature.com/scientificreports www.nature.com/scientificreports/ elective CABG. Hence, very few had a previous myocardial infarction or coronary intervention and few had a previous HF diagnosis. The three patients who had a HF diagnosis were all in the HFpEF proxy group.
HFpeF. HFpEF is more frequent today, which may be due to increasing life span of the population, improved survival after myocardial infarction and increasing rates of HF risk factors like hypertension, overweight, and diabetes. However, the pathophysiology of this disease in not well understood at the transcriptome level. Already in the 1980s, it was recognised that ischemia might lead to diastolic dysfunction. We identified HFpEF characteristics in 31% of the group of patients planned for elective CABG, implying that other prevalent comorbidities www.nature.com/scientificreports www.nature.com/scientificreports/ except coronary artery disease, such as hypertension and diabetes may also play a role for development of HFpEF suggesting a link to microvascular dysfunction 19 . Imaging. HFpEF constitutes a diagnostic challenge and in an individual patient, there may be problematic measure overlaps and grey zones. Poor echocardiographic windows, tachyarrhythmias and atrial fibrillation makes measurements more difficult. The present guidelines advocate the use of at least 4 up to 8 parameters of structural LV dysfunction and diastolic dysfunction for diagnosis and risk prediction, some of these parameters may be used interchangeably 5,20 . In summary, the number of altered variables may increase the precision of the HFpEF diagnosis. In the current study we therefore used state-of-the-art guideline criteria for HFpEF and a majority of the 4-8 criteria achieved in an individual patient should be positive for rendering a HFpEF proxy diagnosis. Our definition is further strengthened by the fact that we used a consensus method in ambiguous patients. Although the five HFpEF proxy patients fulfilled the defined criteria they only had mild dysfunction and low NT-proBNP levels. However, we believe that such early stage HFpEF is highly relevant since these patients seem more sensitive to treatment with RAAS blockade 21 . Thus, RNA sequencing may be of particular value in early or mild HFpEF patients for finding new pathophysiologic translational mechanisms.

Genes dysregulated in HFpEF proxy patients.
To get an overview of the gene expression profiles, we performed PCA and OPLS-DA to distinguish HFpEF proxy patients from those with Normal physiology. Our results (Fig. 1C,D) suggest that the two groups have different gene expression profiles. The downregulated genes (Supplementary Table S1) were analysed with respect to enrichment of GO terms (Fig. 3A,B), revealing that cardiac muscle contraction, oxidative phosphorylation, endocytosis and extracellular matrix organization were associated with the dysregulated genes (Fig. 3A, Supplementary Table S2). These first two GO terms were also found to be enriched in a mouse model of pathological hypertrophy 17 .
Downregulated cardiac contraction genes may correlate with impaired systolic pump function in HFpEF patients as has been described in previous studies 22 . The gene TNNT2 encodes for the tropomyosin binding subunit in the troponin complex which is located in the thin filament of the striated muscle and regulates muscle contraction in response to alteration in the intracellular calcium ion concentration 23 . However, we found no differences between the two groups for measures of systolic function at rest. EF and global longitudinal strain (GLS) were similar in the two groups. The described downregulated genes for myocardial contraction may be related to more long-term changes caused by passive cardiac stiffness by fibrosis or oxidative titin changes 12 .  www.nature.com/scientificreports www.nature.com/scientificreports/ Our study shows that RASD1, coding for ras-related dexamethasone-induced protein 1, is downregulated in HFpEF proxy patients. RASD1 is a modulator of cardiac endocrine function in response to volume overload underlying anti-natriuretic peptide excretion such as atrial natriuretic peptide (ANP) and brain natriuretic peptide (BNP) 24 . RASD1 has also been shown to be downregulated in volume overloaded rat hearts 24 .
Furthermore, PLN, coding for phospholamban was upregulated in HFpEF proxy patients. PLN is a pentamer and a major substrate for the c-AMP dependent kinase in cardiac muscle. Phospholamban modulates the activity of the sarcoplasmatic reticulum ATPase type 2 (SERCA 2a), which in turn modulates Ca 2+ handling by the sarcoplasmatic reticulum and increases both contractility and relaxation, at least in studies of electric cardiac contractility modulation (CCM) therapy 25 . In contrast, these genes have been seen to be downregulated in HFpEF 26 .
During the cardiac cycle there is an active coupling between contractility during systole and relaxation during early diastole. Cardiac contraction and relaxation are both closely linked active energy dependent processes, i.e. in the ischemic cascade. The finding of downregulated genes in the HFpEF proxy group for oxidative  www.nature.com/scientificreports www.nature.com/scientificreports/ phosphorylation and energy supply may thus be considered crucial for development for HFpEF as regards both systolic and diastolic cardiac function and in relation to ischemia in these patients. Further, transcriptional changes may develop more slowly and be counteracted by short-term alterations of contractility and relaxation by changes in neurohormonal activation and sympathetic tone, calcium fluxes, and ischemia 27 . Sympathetic tone is increased in both aging and heart failure, by increasing circulating catecholamines but also decreased β-adrenoreceptor sensitivity 28 . Calcium handling in diastole is essential to remove Ca 2+ from the cytosol to ensure cardiomyocyte relaxation by SERCA back into the Sarcoplasmatic reticulum 29 . Traditional RAAS-and betablockade does not show benefits in HFpEF, but new knowledge of role of β-arrestins and G-protein coupled receptor kinases (GRKs) may open new targets for treatment of HFpEF 30 .
Our results show several downregulated genes involved in extracellular matrix assembly, which may influence remodelling and dilatation of the heart rather than increasing passive myocardial stiffness as shown for HFpEF 12 due to increased synthesis of collagens typically with predominance for collagen I and III in the myocardium. This finding is therefore surprising and counterintuitive. However, the process of myocardial fibrosis is complex and involves dynamics of fibrous tissue turnover including sevveral steps; active synthesis, crosslinking and active degrading of collagen 31 . The genes LAMA5 and LAMB2, coding for two laminin subunits (α5 and β2), were also downregulated in HFpEF proxy patients. The secreted protein acidic and rich in cysteine (SPARC) is a matrix-cellular collagen binding protein serving a key role in collagen assembly into the extracellular matrix. Recent studies demonstrated that disruption of the SPARC gene is associated with decreased capacity to generate organised, mature collagen fibres 32 .
ANKRD1 is a transcription factor known to interact with sarcomeric proteins in the myofibrillar stretch sensor system 33 . It has been observed that the expression of ANKRD1 in both transcription and protein levels were increased in failing heart 15,34 . In our data we also see a trend of upregulated mRNA expression in the HFpEF proxy patients.
We find that the gene LUM coding for lumican is among the most up-regulated genes in the HFpEF proxy group compared to Normal physiology. Lumican is an extracellular matrix localised proteoglycan associated with inflammatory conditions known to bind collagen. In a recent study, cardiac lumican was increased in experimental and clinical HF 35 . This study also indicated that inflammatory and mechanical stimuli induce lumican production by cardiac fibroblasts indicating a role in HF development. However, we are unaware of whether lumican has previously been shown to be up-regulated in CABG and HFpEF patients.
Upstream regulators and regulatory effect networks. The tumour suppressor TP53 has recently been described as an important regulatory factor in the heart 36 . In our study, 96 DE genes (13%) belong to those regulated by TP53 (Fig. 3B). It has been described to be increased in human dilated cardiomyopathy (DCM) 37 , suggesting that elevation of TP53 also plays a key role in the common path toward heart failure. There are data showing that TP53 inhibits angiogenesis by suppressing HIF-1 resulting in myocardial hypoxia and cardiac dysfunction which may be a novel molecular mechanism underlying transition of cardiac hypertrophy to HF 38 .
STAT4 is important in both innate and adaptive immune responses 39,40 . However, STAT4 did not show transcriptional repression in the HFpEF proxy group, indicating that additional mechanisms, e.g. post-translational modifications, protein-protein interactions, might be involved in providing the inhibitory effect.
We further hypothesised a regulatory effect network model for mechanistic understanding of the disease and dysregulated genes using the IPA regulatory effect tool. The network illustrates potential mechanism(s) by which transcription regulator activation (KDM5A, HEY) and inhibition (SRF, IFI16) may lead to impaired cardiac function (Fig. 5).
KDM5A (retinoblastoma-binding protein 2/RBP2) encodes a histone demethylase that is part of the core Notch-RBP-J repressor complex 41 and has implicated in transcriptional regulation of Hox genes and cytokines 42 . It also plays a role in tumour progression and selective inhibition blocks cancer cell growth 43 . Little is known regarding the role of this gene in heart. In a recent study, likely disease-causal KDM5A variants were uncovered in whole exome sequencing in patients with congenital heart disease (CHD) 44 .
HEY2 encodes a basic helix-loop-helix (bHLH)-type transcription factor that is preferentially expressed in the developing and adult cardiovascular system 45 . It acts as a transcriptional repressor downstream of Notch signalling pathway 46 and likely plays a central role in the cardiac transcriptional machinery 47 . For example, HEY2 expression levels influence cardiac hypertrophy and the progression to heart failure in response to pressure overload through modulation of apoptosis and GATA4 activity 48 . Our network analysis results suggest that activation of HEY2 may have contributed to cardiac dysfunction in HFpEF proxy via transcriptional repression of key cardiac transcription activators GATA4 and NKX2.5, among others.
Serum response factor (SRF) is a central cardiac transcription factor required for appearance of beating sarcomeres in the heart 49 . Based on our analysis, inhibition of SRF with accompanied downregulation of its target genes (GATA4, NKX2.5, MYH6, MYH7, TNNT2, TCAP and others) could be one additional explanatory mechanism underlying cardiac dysfunction in the HFpEF proxy group.
Although the exact function of p53 and IFN-inducible gene IFI16 is not currently known, it has been proposed to act as transcriptional repressor and tumour suppressor via activation of p53 signals and inflammasome 50,51 .
In addition to IPA, we also explored network-based approaches for transcription factors enrichment; ChEA-db which is transcription factor targets database inferred from integrating literature curated Chip-X data 52 and transcription factor protein-protein interaction networks 53 . The enrichment analysis revealed TP53, SMAD2, SMAD3 and ESR1 among the enriched transcription factors which were also predicted by IPA as upstream regulators.
Limitations and strengths. The strength of this study consists of the revealed gene expression differences between HFpEF and Normal groups found in myocardial biopsies. However, there are some limitations in the study, such as relatively small number of patients used and unequal distribution between HFpEF and Normal.
www.nature.com/scientificreports www.nature.com/scientificreports/ In order to obtain myocardial biopsies, we have chosen patients undergoing elective CABG enabling us to safely obtain tissue samples. Based on data from the SWEDEHEART (Swedish Web-system for Enhancement and Development of Evidence-based care in Heart disease Evaluated According to Recommended Therapies) registry we knew that around 20-30% of elelective CABG patients had LVEF ≥ 45%. To identify the HFpEF group, we used echocardiographic evaluation according to current international Guidelines. We could confirm that a proportion of patients indeed had HFpEF characterstics. Due to limited amounts of biopsy material, we had to use the entire biopsy to get deeper RNA sequencing for better sensitivity. Consequently, we were unable to perform any validation experiments which may be considered a limitation. A larger cohort of patients, which we aim at in the future, will add more insight into differences in gene expression between these two groups.
Conclusions. This exploratory study could confirm our hypothesis that patients undergoing CABG with HFpEF characteristics compared to patients with Normal physiology had distinctive gene expression in cardiac biopsies with downregulated genes for myocardial contraction, energy supply, remodelling and fibrosis. We consider differences within these functional areas relevant for possible pathophysiological mechanisms in HFpEF. However, down-regulation of these functions in patients with HFpEF characteristics is complex to describe and understand. Our findings lend to support our further studies in a larger patient cohort to find pathophysiological mechanisms that can explain and ultimately lead to treatment for HFpEF.

patients.
Patients enrolled were scheduled for elective CABG without concomitant valve surgery and with preserved LVEF. They all had angina pectoris with or without a previous myocardial infarction. Cardiac biopsies were obtained during CABG for analysis of mRNA expression in the myocardial tissue. All patients were assessed at a baseline visit 4-8 weeks prior to CABG by clinical characteristics, echocardiography and blood sampling including natriuretic peptides. From the ongoing study CABG-PREFERS 1 we now report data from the initial patients.
Descriptive data are presented as median and quartiles (Q1;Q3) or number (%), and comparisons between groups were performed by Wilcoxon rank sum test and Fisher's exact test as appropriate.
Definitions. Preserved LVEF was defined at the time of study design as LVEF ≥ 45% 1 . The patients were divided into two groups according to echocardiography, NTproBNP levels and HF guidelines definitions 5 . The group with echocardiographic characteristics and increased NTproBNP levels indicative of HFpEF 1,5 was called HFpEF proxy for the purpose of this study and was used as a representative for HFpEF even when not showing signs or symptoms of heart failure. The Normal physiology group had LVEF ≥ 45% and no echocardiographic signs of HFpEF 54 .
The HFpEF proxy definition was based on LVEF ≥ 45% and the combination of the following five echocardiographic criteria; 1. left atrial volume indexed for body surface area (LAVI) > 34 ml/m 2 , 2. LV mass index ≥115 g/m 2 for males or ≥95 g/m 2 for females, 3. ratio of early mitral inflow wave velocity (E) to myocardial tissue early diastolic wave velocity (e′) defined as E/e′ ≥ 8, 4. e′ septal <0.07 m/s or e′ mean septal/lateral <0.09 m/s, 5. tricuspid regurgitation velocity >2.8 m/s, and additionally NTproBNP > 125 ng/L. Three abnormal criteria of these five were required to fulfil the definition (majority rule). In equivocal cases, classification was performed by consensus of two experts (M.J.E. and H.P., blinded for clinical characteristics) in line with previous experiences from the CHARM substudy 20 .
Echocardiography. Transthoracic Doppler echocardiography was performed according to guidelines as previously reported 1 . A Vivid 9 ultrasound system (Vingmed-General Electric, Horten, Norway) was used in all studies. Images were digitally stored on a dedicated server, and data analysis was performed offline on the EchoPAC workstation (GE EchoPAC sw only, Norway) by one experienced sonographer. The mean value of 3 cardiac cycles was calculated for each variable.
Tissue collection. From patients undergoing CABG, core needle biopsies were taken from the lateral wall of the left ventricle before initiation of cardiac arrest and stored in −70 °C as previously described 1 and used for mRNA analysis. Patients were prepared for surgery according to standard clinical routines with placement of a central venous line in the internal jugular vein, an arterial line in the distal radial artery and a peripheral venous line in the brachial vein. A midline sternotomy was performed and one or two mammary arteries were procured for usage as conduits 55 .
RNA extraction and sequencing. Total RNA was extracted using the RNeasy Fibrous Tissue Mini Kit (#74704, Qiagen). RNA libraries for sequencing were prepared using poly-A selection and the Illumina RNA strand-specific TruSeq Stranded mRNA Sample prep kit with 96 dual indexes (Illumina, CA, USA) according to the manufacturer's instructions with the following changes. The protocols were automated using an Agilent NGS workstation (Agilent, CA, USA) using purification steps as described 56 Analysis of transcriptome (RNA-Seq) data. Whole transcriptome sequencing was performed for each biopsy. Initial quality checking of the sequencing raw reads was performed to identify potential outliers before doing further analysis using FastQC. Sequencing paired-end reads were mapped towards the human reference genome (version GR38) using Star Aligner 58 with default options, and Ensembl genome annotation (version (2019) 9:3179 | https://doi.org/10.1038/s41598-019-39445-2 www.nature.com/scientificreports www.nature.com/scientificreports/ 37) was used for subsequent analysis. Reads that mapped to the exons of the coding genes were counted using HTSeq. 59 . Genes with count values of zero (i.e. no read detected) in all samples were filtered out before further analysis. Genes were then categorised into different biotypes and distribution over the reference genomes was calculated. Count data was also investigated for several potential biases such as RNA-degradation, GC content etc. using NOISeq. 60 package in R (http://www.R-project.org). Before normalisation, lowly expressed genes were filtered out using proportion test per condition and multiple testing correction as per NOIseq manual 60 . Normalisation of the count data was performed using the Trim mean of M-values (TMM) approach 61 .
Batch correction of the data was performed using ARSyNseq function of the NOISeq package 60 . Differential gene expression analysis was done using the function noiseqbio with parameters k = 0.5, norm = 'n' , lc = 0, r = 20, adj = 1.5, a0per = 0.9, which is recommended for clinical RNAseq samples 60 . For NOISeq, we set the parameter 'q' to 0.95, corresponding to a false discovery rate (FDR) < 0.05. Analysis and plots were generated in R environment using ggplot2.
Principal component analysis (PCA) was performed on the log 2 transformed values of normalised batch corrected expression values using "princomp" function in R environment.
Orthogonal projections to latent structures discriminant analysis (OPLS-DA) was performed using SIMCA v14.1 (Umetrics, Umeå, Sweden) on the already normalised and batch corrected data to identify genes showing high variation in a pairwise manner. The Pareto scaling method was used in this case, since it reduces the relative importance of large values, but keeps data structure partially intact, and we wanted to detect small to medium feature differences.
Clustering analysis of the differentially expressed genes (DEGs) was performed using unsupervised hierarchical clustering. Normalised expression data were standardised before plotting as heatmap using pheatmap package for R environment.

Functional analysis of the dysregulated genes. Gene set enrichment analysis for Gene Ontology (GO)
terms with focus on biological process (BP) and cellular component (CC) was performed for the DEGs (using gene symbols as input) using Enrichr 53 with the probability density function as p-value model. The enrichment was tested using Fisher's exact test with corrected p-value < 0.05. The DEGs were further analysed through the use of IPA 18 (Ingenuity Pathways Analysis; QIAGEN Inc., https://www.qiagenbioinformatics.com/products/ ingenuity-pathway-analysis). This tool uses the information in the Ingenuity ® Knowledge Base to assess signalling and metabolic pathways, upstream regulators, regulatory effect networks, and disease and biological functions that are likely to be perturbed based on a data set of interest (in our case, the DEGs). The IPA upstream analysis 18 was performed to predict which regulators (i.e. any gene, protein or miRNA) that are activated or inhibited based on a calculated Activation Z-score, to explain the observed DEG changes in HFpEF proxy group vs. Normal physiology group. The IPA regulatory effect network analysis generated hypotheses for how a phenotype, function or disease is regulated by activated or inhibited upstream regulators. In our study, regulatory effect network analysis was used to specifically study the impact of the identified upstream transcription factor regulators on downstream heart disease functions, given the observed gene expression changes in HFpEF proxy group vs. Normal physiology group.
Ethics statement. This study was conducted according to the Declaration of Helsinki and approved by the regional Ethics Committee Stockholm. All patients were included following oral and written informed consent.

Data Availability
RNA-seq data have been deposited in the EMBL-EBI ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-7454.