Introduction

Heart failure (HF) with preserved ejection fraction (HFpEF) accounts for more than half of all the patients diagnosed with HF. There is a worldwide increase in its prevalence with limited proven treatment, making it an important healthcare problem of modern cardiology1. Exercise intolerance is a primary manifestation in patients with HFpEF and is associated with symptoms of dyspnea, abnormal central hemodynamics, and a poor quality of life2,3,4. The identification of reduced aerobic capacity may have potentially important therapeutic implications in people with HFpEF.

Recent studies have proposed two multiparametric scoring systems to help diagnose HFpEF in patients with dyspnea. The first is the H2FPEF score, which is based on four clinical variables (the body mass index [BMI], use of two or more antihypertensive medicines, atrial fibrillation [AF], and age) and two echocardiographic variables (the ratio of the early diastolic mitral inflow velocity [E] to early diastolic mitral annular tissue velocity [e′] [E/e′ ratio] and pulmonary artery [PA] systolic pressure [PASP])5. The second is the HFA-PEFF score, which employs a more complex approach that scores the natriuretic peptide levels and the echocardiographic findings of cardiac function and structure6. Beyond their diagnostic value, the two algorithms have recently been shown to be associated with clinical outcomes in patients diagnosed with HFpEF7,8,9,10. However, few data are available regarding whether these diagnostic schemes can predict aerobic capacity in patients with HFpEF.

Accordingly, we examined the performance of the two algorithms for predicting exercise capacity and the echocardiographic estimates of intracardiac pressures during ergometry exercise echocardiography.

Results

Subject characteristics

Of 187 participants, 83 patients were found to have HFpEF, and 104 had non-cardiac dyspnea (controls). Of the 83 patients with HFpEF, 26 patients were diagnosed through invasive catheterization, 24 with the American Society of Echocardiography/European Association of Cardiovascular Imaging (ASE/EACVI) criteria for diastolic dysfunction, and the remaining 33 due to an elevated E/e′ ratio during exercise. Comparisons of the clinical profiles, the HFpEF scores, and the exercise capacity across the different definitions of HFpEF are presented in Supplemental Table 1. The clinical characteristics of the 83 patients with HFpEF and 104 controls are shown in Table 1. Compared with the control group, patients with HFpEF were older and had a higher prevalence of systemic hypertension, diabetes, coronary artery disease, AF, and elevated natriuretic peptide levels. Sex, BMI, and vital signs were similar between the two groups. Patients with HFpEF were treated with angiotensin-converting enzyme inhibitors or angiotensin-receptor blockers, beta-blockers, and loop diuretics more frequently than the control group.

Table 1 Baseline characteristics.

As per resting echocardiography, patients with HFpEF had a larger left ventricular (LV) mass index, relative wall thickness, and left atrial (LA) volume index and poorer LV shortening evidenced by lower longitudinal strain and the systolic mitral annular tissue velocity at the septal annulus (mitral s′) compared to control subjects (Table 1). The LV diastolic function was poorer in patients with HFpEF, with lower mitral e′ velocity and a higher E/e′ ratio as compared to the controls. Patients with HFpEF displayed a higher estimated PASP and right atrial pressure (RAP) and lower velocity at the lateral tricuspid annulus (TV s′) than controls. EF, cardiac output, and arterial-venous oxygen content difference (A-VO2 diff) did not differ between the groups.

The H2FPEF and HFA-PEFF scores

Since data on E/e′ ratio and PASP were available in all participants, the H2FPEF score was calculated for all the patients enrolled in this study; however, the HFA-PEFF score could not be calculated for 23 patients (feasibility, 88%) because of missing data for natriuretic peptide levels. Among the entire cohort, subjects were likely to be classified as low or intermediate probabilities based on the H2FPEF score and as high probability based on the HFA-PEFF score (Fig. 1). As expected, both H2FPEF and HFA-PEFF scores were higher for patients with HFpEF than controls (Table 1). Both high H2FPEF (6–9 points) and HFA-PEFF (5–6 points) scores showed very high positive predictive values (100% and 83%) to diagnose HFpEF while low scores (H2FPEF: 0–1 points and HFA-PEFF: 0–1 points) displayed very high negative predictive values (94% and 100%) (Supplemental Table 2).

Figure 1

Distribution of H2FPEF and HFA-PEFF scores among all participants. The H2FPEF score was more likely to classify subjects into a low or intermediate probability while the HFA-PEFF score categorized them as a high probability.

Exercise capacity and echocardiographic findings during peak exercise

An expired gas analysis was performed simultaneously in 53 patients with HFpEF and 54 control subjects. Compared with the controls, HFpEF patients had a lower peak exercise workload, shorter exercise duration, and lower peak VO2 (Table 2). At peak exercise, heart rate was lower in patients with HFpEF than in the controls, while the systolic blood pressure, oxygen saturation, and A-VO2 diff were similar between the groups. Differences in the mitral e′ and s′ tissue velocities and the E/e′ ratio between the groups increased further during peak exercise. Compared with the control subjects, HFpEF patients had a lower cardiac output (CO) and TV s′ and a higher PASP during peak exercise. The severity of abnormalities in the e′ and s′ velocities, E/e′ ratio, CO, A-VO2 diff, and TV s′ was consistently associated with poorer exercise capacity (Fig. 2 and Table 3).

Table 2 Exercise capacity and echocardiographic measures during peak exercise.
Figure 2

Correlations between echocardiographic measures and exercise capacity. Decreases in mitral annular e′ velocity and cardiac output (CO) during exercise were moderately correlated with lower peak oxygen consumption (VO2). HFpEF, heart failure with preserved ejection fraction.

Table 3 Correlations between echocardiographic measures of hemodynamics and exercise capacity.

Correlations of the HFpEF scores with the echocardiographic findings during peak exercise and the exercise capacity

The H2FPEF score correlated with the echocardiographic findings during peak ergometry exercise (Table 4). In particular, a higher H2FPEF score was associated with a lower LV systolic function (s′ velocity, r = −0.46, p < 0.0001), higher E/e′ ratio (r = 0.49, p < 0.0001), and lower RV systolic function (TV s′, r = −0.44, p < 0.0001) during peak exercise. These correlations remained significant after adjusting for age, sex, BMI, and the presence of hypertension (all p < 0.05). The HFA-PEFF score showed similar correlations with the echocardiographic findings (Table 4). After adjusting for age, sex, BMI, and the presence of hypertension, the HFA-PEFF score remained significantly associated with mitral e′ and s′ velocities and E/e′ ratio (p < 0.01); however, this was not the case for CO, TV s′, and PASP (p > 0.05). The H2FPEF and HFA-PEFF scores did not correlate with A-VO2 diff during exercise.

Table 4 Correlations of the two HFpEF Scores with exercise capacity and echocardiographic measures during peak exercise.

Peak VO2 and exercise duration decreased with an increase in the H2FPEF score (Fig. 3, r = −0.40 and −0.32, p < 0.0001). As expected, both peak VO2 and exercise duration were related to age (r = −0.27, p = 0.005 and r = −0.41, p < 0.0001), and the H2FPEF score remained significantly associated with peak VO2 and the exercise time even after adjusting for age, sex, BMI, and the presence of hypertension (p = 0.03 and p = 0.0002). The H2FPEF score demonstrated a good discriminatory ability for identifying reduced aerobic capacity (peak VO2 < 14 mL/min/kg) (area under the curve [AUC] 0.71, 95% confidence interval [95%CI] 0.59–0.80, p = 0.0005; Fig. 4). An H2FPEF score of ≥ 6 displayed a high specificity (100%) and low sensitivity (15%) for the identification of reduced peak VO2, whereas an H2FPEF score of ≤ 1 had high sensitivity (96%) but very poor specificity (12%). Among components of the H2FPEF score, the age, BMI, E/e′ ratio, and treatment with 2 or more antihypertensives were related to the peak VO2, but these relationships were modest (Table 5). The HFA-PEFF score correlated with exercise duration (r = −0.26, p = 0.0008), but not with the peak VO2 (r = −0.19, p = 0.07). The natriuretic peptide level domain was the only component of the HFA-PEFF score associated with peak VO2 (Fig. 5). The HFA-PEFF score did not predict reduced aerobic capacity (AUC 0.61, 95%CI 0.49–0.73, p = 0.10; Fig. 4).

Figure 3

Correlations between HFpEF diagnostic algorithms and exercise capacity. The H2FPEF score was correlated with peak VO2, but the HFA-PEFF score was not. Abbreviations as in Fig. 2.

Figure 4

Receiver-operating characteristic curves of the H2FPEF and HFA-PEFF scores to predict impaired exercise capacity. AUC, area under the curve.

Table 5 Correlations of the components of H2FPEF and HFA-PEFF scores with exercise capacity.
Figure 5

Peak oxygen consumption according to the HFA-PEFF domain scores. (A) Peak VO2 did not differ among the HFA-PEFF functional domain scores. (B) Peak VO2 was similar among the HFA-PEFF morphological domain scores. (C) In contrast, patients with a natriuretic peptide domain of 2 points displayed lower peak VO2 compared to the other groups. *p < 0.05 vs. 0 point, #p < 0.05 vs. 1 point. Abbreviations as in Fig. 2.

A sensitivity analysis performed for patients with HFpEF and the control subjects separately is presented in Supplemental Table 3. The H2FPEF score correlated with mitral s′, TV s′, peak VO2, and exercise duration in both HFpEF patients and controls; however, some of the correlations failed to remain significant, possibly due to the small sample size. The HFA-PEFF score correlated with mitral e′ and s′ velocities, the E/e′ ratio, and exercise duration in the control subjects, and it correlated with mitral s′ and peak VO2 in patients with HFpEF. The correlations of the H2FPEF score with the E/e′ ratio and TV s′ differed significantly between HFpEF patients and controls (p = 0.02 and 0.04 by Meng’s z-test).

Discussion

In this study, we examined the performance of the H2FPEF and HFA-PEFF scores for predicting exercise capacity and echocardiographic findings during exercise stress echocardiography. We observed that the HFA-PEFF score displayed relatively low feasibility because of the requirement of natriuretic peptide levels while the H2FPEF score displayed high feasibility. We further demonstrated that both H2FPEF and HFA-PEFF algorithms predicted a lower LV systolic and diastolic function, higher estimated LV filling pressure, lower CO, and reduced RV systolic function during peak ergometry exercise. Despite the requirement of a few input variables, the H2FPEF score provided a good discriminatory ability for identifying poor exercise capacity among patients with dyspnea while the HFA-PEFF score did not. These data provide new insights into the role of the H2FPEF and HFA-PEFF scores for predicting exercise intolerance and hemodynamic instability in patients presenting with exertional dyspnea.

Exercise intolerance is a hallmark in patients with HFpEF and studies have uniformly reported that peak VO2 is depressed in HFpEF2,3,11,12,13,14. Reduced exercise capacity is associated with symptoms of dyspnea and a poor quality of life, making it an important therapeutic target in this syndrome3,15. While multiple mechanisms can contribute to exercise intolerance in HFpEF, including abnormalities in the heart, lungs, and the periphery, a pathological increase in cardiac filling pressure that develops during exercise stress may play an important role in limiting exercise performance in patients with HFpEF2,3,16,17,18. The current study demonstrated that the exercise capacity was reduced in patients with HFpEF compared to control subjects, as shown by a lower peak exercise workload, shorter exercise duration, and lower peak VO2. Compared to the controls, patients with HFpEF displayed lower mitral e′ and s′ tissue velocities, a higher E/e′ ratio, reduced CO, and lower TV s′ during peak exercise, and the severity of these abnormalities was consistently associated with a decrease in exercise capacity. These data confirm that abnormal exercise hemodynamics and a reduced CO reserve contribute to exercise intolerance in patients with HFpEF.

Two multiparametric scoring systems have been proposed to help diagnose HFpEF among patients with dyspnea. The H2FPEF score was developed among patients with unexplained dyspnea using the currently recommended gold standard test (i.e., invasive exercise hemodynamic testing), and combines clinical characteristics and echocardiographic measures5. The HFA-PEFF score is a consensus-based approach that employs complex scoring systems based on echocardiographic indices and natriuretic peptide levels6. In the current study, the H2FPEF score was calculated for all patients while the HFA-PEFF score could not be calculated for 12% of the patients due to the lack of data for the natriuretic peptide levels. Although evaluation of the natriuretic peptide levels is a vital part of the diagnostic approach for HFpEF, they may not be measured in all patients with dyspnea6. In contrast, echocardiography is performed in essentially all patients with a clinical suspicion for HFpEF19. Because it requires only a few variables, the H2FPEF score is easier for clinicians to calculate than the HFA-PEFF score. Our data suggest that this practicality may make the H2FPEF score useful in clinical practice. The current study also demonstrated that both high H2FPEF and HFA-PEFF scores showed very high positive predictive values (100% and 83%) to diagnose HFpEF while low scores displayed very high negative predictive values (94% and 100%). These data are consistent with the concept of the HFpEF scoring schemes5,6, and exercise stress testing is not required in patients with very high or low scores. In patients with intermediate probability, exercise testing will be required. Further studies are warranted to directly compare the diagnostic value between the two scores using the gold standard of invasive exercise hemodynamic testing.

Beyond their diagnostic value, the two algorithms have recently been shown to be associated with clinical outcomes in patients with HFpEF7,8,9,10. However, few data are available regarding whether these diagnostic schemes can predict aerobic capacity in HFpEF. In this study, we found that the H2FPEF score was associated with a reduced exercise capacity as assessed by both peak VO2 and exercise duration, which is consistent with the findings of a previous study20. In contrast, while the HFA-PEFF score modestly correlated with the exercise duration (r = −0.26), it was unrelated to the peak VO2. We further demonstrated that the H2FPEF score identified a reduction in the peak VO2; however, the HFA-PEFF algorithm did not. A plausible reason for this is that the H2FPEF score includes both clinical and echocardiographic variables, whereas the HFA-PEFF score does not. It is clear that HFpEF is associated with comorbidities, including obesity, hypertension, diabetes, and AF12,13,14,15. Previous studies have demonstrated that a higher BMI and AF are related to decreased exercise capacity in patients with HFpEF12,14,21,22. The present study showed that individual components of the H2FPEF score were associated with peak VO2. In contrast, only LV longitudinal strain and B-type natriuretic peptides correlated with peak oxygen consumption among the components of the HFA-PEFF score. It is also worth pointing out that the H2FPEF score as a whole is more predictive of the peak VO2 than the individual components. These data support the usefulness of the H2FPEF score for not only diagnosing HFpEF but also for predicting exercise capacity and worse hemodynamics during exercise.

Exercise intolerance is a cardinal manifestation in patients with HFpEF. Given its association with symptoms of dyspnea, abnormal central hemodynamics, and a poor quality of life2,3,4, impaired exercise capacity may reflect the severity of the HFpEF syndrome. Cardiopulmonary exercise testing provides objective evidence of exercise capacity, as well as of ventilatory performance and chronotropic and blood pressure responses during exercise; however, it may not be commonly performed in clinical practice. The current data suggest that clinicians might use the H2FPEF score to identify HFpEF with impaired exercise tolerance among patients with exertional dyspnea. This might also help identify patients who need aggressive treatment. Exercise capacity (VO2) is determined by cardiac (CO) and peripheral (A-VO2 diff) components of O2 transport. We demonstrated that the H2FPEF and HFA-PEFF scores did not correlate with A-VO2 diff during peak exercise. However, we cannot exclude the possibility that the HFpEF scores predict worsening physical condition, and further study is required to test this possibility using other indices, such as clinical facility score.

The association between the H2FPEF score and peak VO2 may raise the question of whether therapies targeting the components of the score could improve aerobic capacity in patients with HFpEF. Obesity may be a promising target given its high prevalence and pathophysiologic significance14. Kitzman and colleagues demonstrated that weight loss induced by caloric restriction or aerobic exercise training improved peak VO2, reduced LV mass and inflammatory markers, and enhanced the quality of life in obese patients with HFpEF23. Bariatric surgery has been demonstrated to improve functional capacity in obese patients with HF and a reduced EF24. Patients with obesity and increased adiposity may respond better to sodium-glucose co-transporter 2 inhibitors through reductions in the plasma volume and visceral and epicardial fat. A substantial proportion of patients with HFpEF develop AF, and they may experience biatrial dysfunction, a poor functional capacity, RV dysfunction, and an increased risk of death12,25,26. Catheter ablation may be effective in reversing or at least preventing the adverse consequences of AF27,28; however, this should be tested in prospective trials. Intensive treatment of isolated hypertension was shown to be effective for preventing the development of HF29. Further studies are required to test whether it will prevent or mitigate the progression of HFpEF.

The current study has several limitations. All participants were referred for exercise stress echocardiography. This might introduce selection bias. The sample size was relatively small, which could cause bias in the overall results. The control group was not standard as they were referred for exercise stress echocardiography for the evaluation of exertional dyspnea and had multiple comorbidities, which could also cause bias. Given the presence of exertional dyspnea and comorbidity burden, control subjects may be considered as pre-HFpEF, and the inclusion of controls in the overall analyses might add great insight into the continuous relationships between the magnitude of HFpEF algorithms, exercise intolerance, and cardiac abnormalities across the spectrum from risk to frank HFpEF. Three different definitions were used for elevations in LV filling pressure in the HFpEF diagnostic criteria (invasively measured pulmonary capillary wedge pressure [PCWP] at rest and/or with exercise, the ASE/EACVI criteria, and exercise E/e′ ratio). There might be heterogeneity across the different definitions, which could cause bias in the results30. Most of the participants fulfilled Step 1 of the HFA-PEFF score (Initial workup); however, the last two steps proposed by the HFA-PEFF score (Step 3 [Advanced workup using exercise testing] and Step 4 [Etiological workup]) were not considered in this study. LV longitudinal strain was determined using apical four-chamber views. We did not have data on anaerobic threshold, which precluded assessment of the relationship between the HFpEF scores and worsening physical condition.

Conclusions

Both H2FPEF and HFA-PEFF scores were associated with a lower LV systolic and diastolic function, higher estimated LV filling pressure, lower CO, and reduced RV systolic function during peak ergometry exercise. Despite the requirement of only a few input variables, the H2FPEF score provided a good discriminatory ability for identifying poor exercise capacity among patients with dyspnea; however, the HFA-PEFF score did not. These data may provide new insights into the role of the H2FPEF and HFA-PEFF scores for predicting exercise intolerance and abnormal hemodynamics in patients presenting with exertional dyspnea.

Methods

Study population

This was a retrospective cross-sectional study to determine the accuracy of the HFpEF scores for predicting exercise capacity and echocardiographic findings of hemodynamics during exercise. Consecutive patients who were referred to the echocardiographic laboratory of the Gunma University Hospital for exercise stress echocardiography due to exertional dyspnea between September 2019 and July 2021 were enrolled. HFpEF was defined by typical clinical symptoms (dyspnea and fatigue), normal LV EF (> 50%), and objective evidence of elevated left heart filling pressures at rest and/or with exercise (at least one of the following: the ASE/EACVI-recommended echocardiographic diastolic dysfunction; E/e′ during exercise > 15; or invasively-measured PCWP at rest > 15 mmHg and/or with supine ergometry exercise ≥ 25 mmHg)5,31,32.

Subjects who were referred for exercise echocardiography because of exertional dyspnea but had no evidence of HF (criteria above) were included as controls. Patients with EF < 50%, significant left-sided valvular heart disease (> moderate regurgitation, > mild stenosis), infiltrative, restrictive, or hypertrophic cardiomyopathy, and non-Group II pulmonary artery hypertension or exercise-induced pulmonary hypertension without elevation in the E/e′ ratio were excluded. The study was approved by our Institutional Review Board with the waiver of consent because of its retrospective design (Gunma University Hospital, Clinical Research Review Board), and was performed in accordance with the Declaration of Helsinki. All authors have read and agreed to the manuscript as written.

Assessment of cardiac structure and function

Two-dimensional and Doppler echocardiography was performed by experienced sonographers using a commercially available ultrasound system (Vivid E95, GE Healthcare, Horten, Norway). EF and mitral s′ velocity were measured to assess LV systolic function. LV deformation analyses were also performed offline with commercially available software (EchoPAC, GE, Milwaukee, Wisconsin) to measure LV longitudinal strain. An apical four-chamber view was used to calculate the LV longitudinal strain. E, e′, and the average E/e′ ratio were used to assess LV diastolic function31. Stroke volume was determined from the LV outflow dimension and the pulsed Doppler profile, and CO was then calculated as the product of heart rate and stroke volume. PASP was calculated as 4 × (peak tricuspid regurgitation [TR] velocity)² + estimated RAP. RV systolic function was assessed using systolic tissue velocities at the lateral tricuspid annulus.
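The derived quantities in this paragraph can be written out as a minimal Python sketch. The PASP and CO formulas follow the text directly. The stroke-volume formula (LV outflow tract area multiplied by the velocity-time integral) is the conventional Doppler method and is an assumption here, since the text names only the inputs. All function and parameter names are hypothetical.

```python
import math


def stroke_volume_ml(lvot_diameter_cm: float, lvot_vti_cm: float) -> float:
    """Stroke volume (mL) from the LV outflow tract (LVOT).

    Assumption: SV = circular LVOT area (cm^2) x velocity-time integral (cm),
    the conventional Doppler method; the paper does not spell this out.
    """
    area_cm2 = math.pi * (lvot_diameter_cm / 2.0) ** 2
    return area_cm2 * lvot_vti_cm


def cardiac_output_l_min(heart_rate_bpm: float, sv_ml: float) -> float:
    """Cardiac output (L/min) as heart rate x stroke volume, as in the text."""
    return heart_rate_bpm * sv_ml / 1000.0


def pasp_mmhg(peak_tr_velocity_m_s: float, rap_mmhg: float) -> float:
    """PASP (mmHg) = 4 x (peak TR velocity in m/s)^2 + estimated RAP."""
    return 4.0 * peak_tr_velocity_m_s ** 2 + rap_mmhg


if __name__ == "__main__":
    # Illustrative inputs only, not study data.
    sv = stroke_volume_ml(lvot_diameter_cm=2.0, lvot_vti_cm=20.0)
    print(round(sv, 1), cardiac_output_l_min(70, sv), pasp_mmhg(3.0, 5.0))
```

Keeping units in the parameter names is a deliberate choice: TR velocity must be in m/s for the factor of 4 to yield mmHg.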

All subjects underwent supine cycle ergometry echocardiography, starting at 20 W for 5 min and increasing in 20 W increments in 3-min stages to subject-reported exhaustion, as previously described33. Echocardiographic images were obtained at baseline and during all the stages of exercise. During exercise, mitral annular tissue velocities were measured at the septal annulus. All the Doppler measurements represent the mean of ≥ 3 beats. All studies were interpreted offline and in a blinded fashion by a single investigator (M.O.). In a subset of participants, an expired gas analysis was performed simultaneously with echocardiography at rest and throughout the exercise to measure breath-by-breath VO2. In the current study, impaired exercise capacity was defined by peak VO2 of < 14 mL/min/kg based on a previous study34. The A-VO2 diff was calculated using the Fick equation2 (VO2 divided by CO).
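As a brief sketch of the two definitions above, the following Python computes the A-VO2 difference from the Fick relationship and applies the peak VO2 cut-off. The unit handling is an assumption: absolute VO2 in mL/min divided by CO in L/min gives mL of O2 per litre of blood, which is divided by 10 to express the result in mL/dL. The function names are hypothetical.

```python
PEAK_VO2_CUTOFF_ML_MIN_KG = 14.0  # threshold for impaired capacity stated in the text


def avo2_diff_ml_dl(vo2_ml_min: float, cardiac_output_l_min: float) -> float:
    """Arterial-venous O2 content difference via the Fick equation (VO2 / CO).

    Assumption: VO2 is absolute (mL/min, not indexed to body weight) and the
    result is reported in mL O2 per dL of blood.
    """
    if cardiac_output_l_min <= 0:
        raise ValueError("cardiac output must be positive")
    return vo2_ml_min / cardiac_output_l_min / 10.0


def has_impaired_exercise_capacity(peak_vo2_ml_min_kg: float) -> bool:
    """True when weight-indexed peak VO2 is below 14 mL/min/kg."""
    return peak_vo2_ml_min_kg < PEAK_VO2_CUTOFF_ML_MIN_KG
```

Note that the Fick calculation uses absolute VO2, whereas the cut-off applies to VO2 indexed to body weight, so the two functions take different inputs.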

Calculation of the H2FPEF and HFA-PEFF scores

The H2FPEF score is based on four clinical parameters (BMI > 30 kg/m² [2 points], treatment with two or more antihypertensive medicines [1 point], AF [3 points], and age > 60 years [1 point]) and two echocardiographic variables (E/e′ ratio > 9 [1 point] and PASP > 35 mmHg [1 point])5. This results in a categorical H2FPEF score5 ranging from 0 to 9. H2FPEF scores of 0–1 are associated with a low probability of HFpEF (< 25%) and scores of 6–9 are associated with a high probability of HFpEF (> 90%).
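The point assignments above translate into a short calculation. The sketch below is illustrative rather than the authors' implementation; the class and function names are hypothetical, and labelling scores of 2–5 as "intermediate" is inferred from the low (0–1) and high (6–9) ranges given in the text.

```python
from dataclasses import dataclass


@dataclass
class H2FPEFInputs:
    bmi_kg_m2: float
    n_antihypertensives: int
    atrial_fibrillation: bool
    age_years: float
    e_over_e_prime: float
    pasp_mmhg: float


def h2fpef_score(x: H2FPEFInputs) -> int:
    """Sum the six H2FPEF components (range 0 to 9)."""
    score = 0
    score += 2 if x.bmi_kg_m2 > 30 else 0            # obesity: 2 points
    score += 1 if x.n_antihypertensives >= 2 else 0  # two or more drugs: 1 point
    score += 3 if x.atrial_fibrillation else 0       # AF: 3 points
    score += 1 if x.age_years > 60 else 0            # age: 1 point
    score += 1 if x.e_over_e_prime > 9 else 0        # E/e' ratio: 1 point
    score += 1 if x.pasp_mmhg > 35 else 0            # PASP: 1 point
    return score


def h2fpef_category(score: int) -> str:
    """Map a score to a probability category (2-5 assumed intermediate)."""
    if not 0 <= score <= 9:
        raise ValueError("H2FPEF score must be between 0 and 9")
    if score <= 1:
        return "low"
    if score >= 6:
        return "high"
    return "intermediate"
```

For example, a 70-year-old patient with AF, a BMI of 32 kg/m², two antihypertensive drugs, an E/e′ ratio of 12, and a PASP of 40 mmHg receives the maximum of 9 points.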

The assessment of the HFA-PEFF score was limited to Step 2 of the algorithm6. The score was calculated as the sum of echocardiographic functional (age-specific cut-offs for e′ velocity, E/e′ ratio, TR velocity, and longitudinal strain: maximum 2 points) and the morphological domains (rhythm-specific LA volume, relative wall thickness, and sex-specific measures of LV mass: maximum 2 points) and natriuretic peptide domains (maximum 2 points). Patients with a total score of 0 to 1 are considered to have a low probability of HFpEF, 2 to 4 as intermediate, and 5 to 6 as high probability. The HFA-PEFF score was calculated if all three domains were available.
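The Step 2 total and its categories can be sketched as follows. This assumes the three domain scores (each 0 to 2) have already been derived from the age-, rhythm-, and sex-specific cut-offs, which are not reproduced here. The function names are hypothetical, and returning `None` for a missing domain mirrors the rule that the score was calculated only when all three domains were available.

```python
from typing import Optional


def hfa_peff_total(functional: Optional[int],
                   morphological: Optional[int],
                   natriuretic: Optional[int]) -> Optional[int]:
    """Sum the three domain scores; None if any domain is unavailable."""
    domains = (functional, morphological, natriuretic)
    if any(d is None for d in domains):
        return None  # e.g. natriuretic peptide level not measured
    for d in domains:
        if not 0 <= d <= 2:
            raise ValueError("each domain score must be 0, 1, or 2")
    return sum(domains)


def hfa_peff_category(total: int) -> str:
    """0-1 low, 2-4 intermediate, 5-6 high probability of HFpEF."""
    if not 0 <= total <= 6:
        raise ValueError("HFA-PEFF total must be between 0 and 6")
    if total <= 1:
        return "low"
    if total <= 4:
        return "intermediate"
    return "high"
```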

Statistical analysis

Data are reported as mean (SD), median (IQR), or number (%) unless otherwise specified. Between-group differences were compared using the unpaired t-test, Wilcoxon rank-sum test, or chi-square test, as appropriate. Pearson’s (normally distributed data) or Spearman’s correlation coefficients (non-normally distributed data) were used to assess relationships between two variables of interest, as appropriate. Multiple linear regression models were then used to adjust for age, sex, BMI, and the presence of hypertension. Meng’s test was used to determine whether there was a significant difference in the strength of the correlation coefficients. Receiver operating characteristic curves were constructed to evaluate the performance of HFpEF diagnostic schemes for predicting reduced exercise capacity (peak VO2 < 14 mL/min/kg)34. All tests were 2-sided, with a value of p < 0.05 considered significant. All statistical analyses were performed with JMP 14.0.0 (SAS Institute, Cary, NC, USA).
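To illustrate what the receiver operating characteristic analysis measures, the sketch below computes the AUC as the probability that a randomly chosen subject with reduced exercise capacity has a higher score than one without (ties counted as one half), along with sensitivity and specificity at a score cut-off. It is a plain-Python illustration, not the JMP procedure used in the study; the function names are hypothetical and the example values are invented toy numbers, not study data.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC via pairwise comparison of positive and negative cases."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores: Sequence[float], labels: Sequence[bool],
                            cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is test-positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)
    fn = sum(1 for s, y in pairs if y and s < cutoff)
    tn = sum(1 for s, y in pairs if not y and s < cutoff)
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy example only: True marks peak VO2 < 14 mL/min/kg.
    toy_scores = [7, 4, 3, 2, 5, 1, 2, 0]
    toy_labels = [True, True, True, False, False, False, False, False]
    print(roc_auc(toy_scores, toy_labels))
    print(sensitivity_specificity(toy_scores, toy_labels, cutoff=6))
```

A high cut-off trades sensitivity for specificity, which is the pattern reported for an H2FPEF score of ≥ 6.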