The 6-minute walk test is a widely used measure of exercise tolerance and a predictor of patient-centred outcomes. In patients with cardiovascular disease, including valve disease, current guidelines advise considering exercise capacity for diagnostics and treatment planning1,2. Wrist-worn devices are constantly improving and have become available to large parts of the population. Today’s sensors typically include mechanical and optical methods to measure activity and heart rate that provide information on individual exercise intensities and gross energy expenditure.

Previous studies have identified wrist-worn devices, accelerometers and pedometers as effective tools to increase patients' daily activity3,4 and have explored the associations between physical activity, cardiovascular events and risk factors5,6,7,8. Although many devices are designed to record activity, it has not been studied whether wrist-worn devices can predict 6-minute walk test results to accurately assess exercise capacity and enable comparisons between patients.

The 6-minute walk test has been clinically validated and has been used to determine the effects of therapeutic interventions9,10 and prognosis11,12. Although standardized medical exercise tests such as 6-minute walk tests are easy to perform, they still require visiting healthcare services. Wrist-worn devices could offer the advantage of broad availability and may allow performing measurements at home and during everyday activity. Additionally, wearable devices can provide continuous monitoring, which enables trends to be identified and makes it easier to distinguish the deteriorating patient from the patient who is doing well.

We, therefore, aimed to analyse if 6-minute walk test results can be predicted by heart rate-based activity profiles obtained from wrist-worn devices in combination with literature data in patients with valvular heart disease.


Baseline characteristics and differences between centres

In total, N = 123 datasets from 91 patients with mitral or aortic valve disease were acquired between March 2017 and October 2018. Of those, n = 9 datasets were excluded due to tachycardic atrial fibrillation, and n = 7 due to unavailability of 6-minute walk test data, resulting in a total of n = 107 included datasets and 1,019,748 min of recordings from 84 patients at both sites (Fig. 1). Accordingly, the average recording time of one dataset was 159 h. A total of n = 23 patients contributed a second dataset at the time of a 6-month clinical follow-up after undergoing a valve replacement procedure. As patient characteristics had changed, these patients are represented by two datasets. Baseline characteristics of included datasets are shown in Table 1. Disease severity according to classification standards was mild in 70 (65.4%), moderate in 18 (16.8%) and severe in 19 (17.8%) patients. Across all heart rate-based activity levels, most time was spent in light activity (51.1%, interquartile range (IQR) 44.8–55.9%). The median 6-minute walk distance was 517 m (IQR 409–581 m). The median percentage of the target 6-minute walk distance achieved was 97% (IQR 83.69–109.78%). Targets are patient-specific and based on age, gender and body mass index (BMI). Parameters of activity are shown in Table 2.

Fig. 1: STROBE (Strengthening the Reporting of Observational Studies in Epidemiology Initiative) flow diagram.

Overview illustrating the selection of datasets and reasons for exclusion.

Table 1 Baseline characteristics.
Table 2 Parameters of activity.

Of all included datasets, 55 originated from German Heart Centre Berlin and 52 from Sheffield Teaching Hospitals. Patients from the two study centres did not differ in daily time spent in different levels of activity or in disease severity. Patients from the German Heart Centre Berlin were younger (62.15 vs. 68.38 years, p = 0.012), had lower BMIs (26.39 vs. 28.19 kg/m2, p = 0.018), a higher number of steps per day (29,883.98 vs. 24,931.41 steps, p = 0.01) and higher 6-minute walk distances (554.27 vs. 420.13 m, p < 0.001), and achieved higher percentages of their target 6-minute walk distances (103.68% vs. 87.16%, p < 0.001).

Correlation between heart rate- and motion sensor-based activity counts

Initially, associations between pairs of activity measures were evaluated. Correlations were found between heart rate-based combined daily time spent in light/moderate activity and steps (R2 = 0.422, p < 0.001, Fig. 2d) as well as activity counts (R2 = 0.539, p < 0.001, Fig. 2c), both obtained from motion sensor data of the device. There were also correlations between heart rate-based daily time spent in moderate activity alone and step counts (R2 = 0.08, p = 0.003, Fig. 2b) as well as activity counts (R2 = 0.087, p = 0.002, Fig. 2a).

Fig. 2: Relationship between steps and activity counts, recorded by the device at different heart rate-based activity levels.

a Daily activity counts measured by the wrist-worn device plotted against daily time spent in moderate activity (R2 = 0.087, p = 0.002). b Daily steps plotted against daily time spent in moderate activity (R2 = 0.08, p = 0.003). c Daily activity counts plotted against time spent in combined light and moderate activity (R2 = 0.539, p < 0.001). d Daily steps plotted against time spent in combined light and moderate activity (R2 = 0.422, p < 0.001). AU arbitrary units.

The percentage of time spent in moderate activity correlated with the absolute 6-minute walk distances (R2 = 0.059, p = 0.012, Fig. 3a). No correlations were found between combined time spent in light/moderate activity and the 6-minute walk distances (R2 = 0.005, p = 0.472, Fig. 3b).

Fig. 3: Relationship between 6-minute walk distances and the percent time spent at different heart rate-based activity levels (without consideration of anthropometric and demographic parameters).

a Six-minute walk distances in meters plotted against daily time spent in moderate activity (R2 = 0.059, p = 0.012). b Six-minute walk distances in meters plotted against time spent in combined light and moderate activity (R2 = 0.005, p = 0.472). m meters.

For the correlations shown in Fig. 3, other patient-specific data (anthropometrics, demographics) were not considered, and thus they only assess the correlation between pairs of variables.

Prediction of 6-minute walk test outcomes

In a logistic regression model, the combination of the time spent in moderate activity, age and the type of disease predicted the achievement of patient-specific target 6-minute walk distances (overall model p < 0.001, pseudo-R2 = 0.16). The odds ratios were 1.54 (95% CI 1.04–2.3, p = 0.037) for percent time spent in moderate activity, 1.05 (95% CI 1.01–1.1, p = 0.007) for each year in age, 0.1 (95% CI 0.02–0.56, p = 0.009) for presence of aortic stenosis and 0.27 (95% CI 0.09–0.82, p = 0.021) for presence of mitral regurgitation. The area under the ROC curve was 75.9%, with 71% of all cases correctly classified regarding achievement or failure to achieve target 6-minute walk distances (reference distances of healthy individuals), resulting in a sensitivity of 65% and a specificity of 77% for the combined model. The use of beta blockers (95% CI 0.38–2.6, p = 0.985) as well as NYHA classes (NYHA II 95% CI 0.67–6.38, p = 0.204, NYHA III 95% CI 0.67–6.38, p = 0.341, NYHA IV 95% CI 0.05–24.26, p = 0.972) did not show a relevant influence within the model. Moreover, running the model without including information on the percentage of time spent in moderate activity did not result in a valid prediction. The probability of achieving target 6-minute walk distances for different percentages of time spent in moderate activity is shown in Fig. 4.
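To illustrate how the reported odds ratios can be read, the following minimal Python sketch scales the odds of achieving the target distance by an odds ratio raised to the size of a change in a covariate. It assumes that the odds ratio of 1.54 applies per percentage point of daily time spent in moderate activity. Because the model intercept is not reported here, the baseline probability used in the example is purely hypothetical, and the function names are illustrative rather than part of the study software.

```python
# Odds ratios quoted in the text (logistic regression model).
OR_MODERATE_ACTIVITY = 1.54  # assumed to apply per percentage point of time in moderate activity
OR_AGE = 1.05                # per additional year of age


def odds_multiplier(odds_ratio: float, delta: float) -> float:
    """Factor by which the odds change when a covariate changes by `delta` units."""
    return odds_ratio ** delta


def shifted_probability(baseline_probability: float, odds_ratio: float, delta: float) -> float:
    """Probability after changing one covariate by `delta`, all others held fixed.

    `baseline_probability` is a hypothetical starting point; the model
    intercept is not reported, so absolute probabilities cannot be derived.
    """
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline_probability must lie strictly between 0 and 1")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_multiplier(odds_ratio, delta)
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    # Hypothetical patient with a 50% baseline probability who spends
    # four percentage points more of the day in moderate activity.
    print(odds_multiplier(OR_MODERATE_ACTIVITY, 4))
    print(shifted_probability(0.5, OR_MODERATE_ACTIVITY, 4))
```

Under these assumptions, a difference of four percentage points in moderate activity multiplies the odds by roughly 5.6, which corresponds to the spacing between the curves for 0, 4 and 8% in Fig. 4.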

Fig. 4: Probabilities of achieving target 6-minute walk distances dependent on age.

Curves represent different percentages (0, 4 and 8%) of daily time spent in moderate activity.

In a robust regression model, the time spent in moderate activity in combination with gender, age and BMI as covariates was able to predict 6-minute walk distances (p < 0.001, R2 = 0.48). Each additional percentage of moderate activity led to an increase in 6-minute walk distance of 10.86 m (95% CI 1.38–20.44, p = 0.027), every additional year of life resulted in a decrease in 6-minute walk distance (−4.92 m, 95% CI −6.71 to −3.12, p < 0.001) and each kg/m2 of BMI in a decrease of 7.42 m (95% CI −12.45 to −2.39, p = 0.004). On average, women achieved 77.2 m (95% CI −121.96 to −32.43, p = 0.001) less than men. Beta blocker therapy was included in the model as a baseline covariate and was without relevant impact on patient-specific 6-minute walk distances (95% CI −32.54 to 42.68, p = 0.79). The standard deviations of our model-based predictions were between 12.53 and 49.98 m in men and between 18.72 and 57.83 m in women. For comparison, the minimal detectable change (MDC) of the 6-minute walk test at the 95% confidence level in frail older adults has been reported as 28.1 m13, and a systematic review reported the minimal clinically important difference of the test to range from 14 to 30.5 m14. Predicted 6-minute walk distances in meters (including their uncertainty) for specific combinations of time spent in moderate activity, gender, age and BMI are provided in Tables 3 and 4.

Table 3 Predicted 6-minute walk distances and standard deviations in meters for men.
Table 4 Predicted 6-minute walk distances and standard deviations in meters for women.
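As a worked illustration of the robust regression coefficients reported above, the following Python sketch computes the difference in predicted 6-minute walk distance between two otherwise comparable patients. It is a sketch under stated assumptions: the model intercept and the beta blocker coefficient are not reported in the text, so only differences (not absolute distances) are computed, and the function and parameter names are hypothetical.

```python
# Coefficients quoted in the text (metres per unit change of the covariate).
COEF_MODERATE_PCT = 10.86   # per additional percentage of time in moderate activity
COEF_AGE_YEARS = -4.92      # per additional year of age
COEF_BMI = -7.42            # per additional kg/m2 of BMI
COEF_FEMALE = -77.2         # women relative to men


def predicted_distance_difference(
    d_moderate_pct: float = 0.0,
    d_age_years: float = 0.0,
    d_bmi: float = 0.0,
    d_female: int = 0,
) -> float:
    """Difference in predicted 6-minute walk distance (metres) between two patients.

    Each argument is the difference in that covariate between the patients;
    `d_female` is 1 when comparing a woman with a man, -1 for the reverse
    and 0 for patients of the same gender. The intercept cancels out.
    """
    return (
        COEF_MODERATE_PCT * d_moderate_pct
        + COEF_AGE_YEARS * d_age_years
        + COEF_BMI * d_bmi
        + COEF_FEMALE * d_female
    )


if __name__ == "__main__":
    # Two percentage points more moderate activity, everything else equal.
    print(predicted_distance_difference(d_moderate_pct=2))
    # Ten years older and 3 kg/m2 heavier, same gender and activity.
    print(predicted_distance_difference(d_age_years=10, d_bmi=3))
```

Absolute predicted distances and their standard deviations should be taken from Tables 3 and 4, since they depend on the unreported intercept.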


In this study, we used heart rate monitoring from wearables in combination with literature-based reference data to determine the daily amount of time spent in different levels of activity. The time spent in moderate activity was able to predict outcomes of a 6-minute walk test in patients with valvular heart disease. In combination with information on a patient’s gender, age, BMI and disease type, absolute 6-minute walk test distances as well as the probability of achieving target 6-minute walk distances can be predicted (Fig. 5). Furthermore, the uncertainty of these model-based predictions is reported and overlaps with the minimal detectable changes and the minimal clinically important differences of the 6-minute walk test.

Fig. 5: Graphical summary.

Step-by-step presentation of the concept for predicting 6-minute walk distances based on daily recordings from wrist-worn devices in combination with demographic and anthropometric data.

Exercise testing in cardiology can help to distinguish symptomatic patients, provide prognostic information before therapeutic interventions and thus can play an integral role in decision-making processes15. The 6-minute walk test is an inexpensive and feasible method that can be performed in clinical and ambulatory settings. Nevertheless, it is limited to submaximal exercise levels and does not provide information on causes of limiting factors, which has remained a more exclusive domain of ergometric tests16. However, especially with bicycle ergometers, maximal exercise levels may not be achieved due to general exhaustion or fatigue of the quadriceps muscle17. Both the 6-minute walk test and ergometric exercise tests typically require special equipment and trained personnel and are of very limited use in children and patients with frailty. Both methods, furthermore, strongly depend on the patient's motivation.

In aortic stenosis and mitral regurgitation, a decrease of exercise capacity can indicate the onset of symptoms as well as a worsening of the haemodynamic status and it is therefore commonly regarded as an indication for intervention1,2. Its early recognition can be an important determinant for the outcome, as arrhythmia, sudden cardiac death and heart failure can occur when symptomatic patients are left untreated18,19,20,21. Hence, additional ways for an uncomplicated evaluation of exercise capacity are of potential clinical value.

Activity in this study was identified using daily heart rate profiles obtained by a wrist-worn device. Previous studies assessing the accuracy of such devices have found an overall high accuracy for measuring heart rate22,23 as well as steps24,25,26, whereas different intensity levels24 and energy expenditure25,26 could only be determined imprecisely. The combined time spent in light/moderate activity correlates with daily steps (Fig. 2d). However, the results of the present study indicate that solely determining overall physical activity is not sufficient to predict exercise capacity. Only when the specific time spent in moderate activity was quantified could 6-minute walk test outcomes be effectively predicted. In contrast, the combined time spent in light/moderate activity is dominated by light activity. This measure did not correlate with the 6-minute walk distances, and even when combined with demographic and anthropometric data it was unable to determine these outcomes. In our study cohort, the probability of achieving the target 6-minute walk distances of a reference population increased with age (Fig. 4). This effect may at least in part be attributed to shorter overall target distances in the elderly in combination with a tendency for less activity. At the same time, older patients of our study cohort performed better than younger patients when compared to clinically used reference populations of the same age16. Nevertheless, patients with a higher percentage of moderate activity performed better within their age group. This also underlines that anthropometric and demographic data alone cannot accurately predict BMI-, age- and gender-specific 6-minute walk test outcomes in a cohort of patients with heart disease, as individual conditions are not considered. Therefore, the robust regression model combines moderate activity with demographic and anthropometric data to predict individual 6-minute walk distances. High activity occurred only sporadically within the observed patient population and can also include errors due to phases of tachycardia, such as in atrial fibrillation.

Besides identifying arrhythmic events27, a heart rate-based approach for evaluating exercise capacity may have the advantage of enabling continuous surveillance during everyday life, e.g., for telemonitoring. A large amount of data gained from telemonitoring could, in turn, be used to further improve the method and reduce its statistical uncertainty. The advantages of continuous monitoring of exercise capacity may be particularly well illustrated by the example of asymptomatic aortic stenosis, where the timing of intervention is still controversial21,28,29,30. While current guidelines strongly advise surgical intervention as soon as symptoms occur, in asymptomatic patients it is only recommended in cases of severe stenosis1,2. However, symptoms are prone to subjective interpretation and may not always be perceived by patients in the same way29,30. Additionally, monitoring moderate activity may also be beneficial in settings where physical activity is restricted to avoid symptoms. Koehler et al.31 have recently found that telemedical monitoring of heart failure patients can help to reduce all-cause mortality and hospital days due to unplanned cardiovascular reasons. The continuous monitoring of exercise capacity could additionally help to detect gradual clinical worsening and should, therefore, be further evaluated to better understand the potential benefits of such broadly available information for disease detection and therapy planning.

In cardiopulmonary exercise testing, exercise is usually quantified by an externally set workload. A heart rate-based approach, however, relies on the interpretation of a surrogate where literature evidence is used to identify activity states.

Finally, we acknowledge that this study has certain limitations. To determine resting heart rate, activity sensor data of the device were used. The paroxysmal occurrence of tachycardic atrial fibrillation, especially common in the older population32, may potentially influence the heart rate-based analysis of activity. To reduce the influence of atrial fibrillation on measurements, heart rate profiles were scanned for tachycardia, and affected datasets were excluded. However, it cannot be ruled out that datasets with short periods of atrial fibrillation were still included. Additionally, moderate activity itself can trigger relevant tachycardia. The combined simultaneous use of heart rate sensors, actometers and novel integrated electrocardiographic sensors may help to better detect such events and to distinguish between tachyarrhythmia and activity.

The influence of age, medication and physical fitness on the resting heart rate may be a limitation for categorizing activity based on heart rate data. Therefore, the use of beta blockers was included as a covariate within the models and was shown to be without relevant impact on the achievement of target 6-minute walk distances or on absolute distances. Furthermore, the thresholds for activity levels are based on individual resting heart rates. Consequently, for patients with low resting heart rates, e.g., due to beta blockers or a high fitness level, the thresholds for activity levels would be correspondingly lower.

Compared to large-scale physical activity data33, the average number of steps per day in our cohort was high, which may in part be attributed to an increased awareness that can influence daily activity. Additionally, the large-scale datasets were acquired from mobile phones, which are not always carried. In contrast, the wrist-worn devices had a high wearing time, which has a direct impact on the number of recorded steps.

When determining exercise capacity based on daily physical activity, it should be considered that external factors can affect activity. For example, exceptional situations or events may occur during measurement periods and influence daily activity, so that it no longer reflects the individual exercise capacity. By extending measurement periods and patient counts, this confounder could be further limited. Several other factors including orthopaedic or mental diseases can influence everyday physical activity34,35. However, long-term changes in daily physical activity are commonly known to also result in changes in exercise capacity36. Larger sample sizes may help to further improve the model and reduce variances.

The 6-minute walk test has become a widely used prognostic marker of patient-centred outcomes including death or hospitalization37. Its ability to determine exercise capacity can be influenced by several individual physical and psychological factors, including pain, motivation and co-morbidities. Although it is considered to be a standardized test, its variability can be substantial even in healthy populations. As a goodness-of-fit indicator, the R2 value of the robust regression model was approximately 0.5 within our cohort. In line with this finding, an inherently large amount of unexplained variation can typically be found in exercise testing. In healthy subjects, coefficients of determination (R2) similar to our values have been reported when comparing the expected (equation-based reference values) and the actually achieved walking distances38. Additionally, the uncertainty of the model-based predictions (provided in Tables 3 and 4) overlapped with the minimal clinically important differences of the 6-minute walk test. Hence, we consider the findings in our disease-specific cohort to be clinically meaningful and largely within the limitations of the 6-minute walk test14, although the amount of unexplained variation in this clinical standard method can be high.

Heart rate-based activity levels and the model’s output were not directly tested against more objective variables of functional capacity tests or morbidity/mortality outcomes. Therefore, further studies are needed to assess how these measures translate into patient-specific outcomes. Some studies assessing 6-minute walk tests have found an association between results from cardiopulmonary exercise testing and the 6-minute walk test39,40,41, whereas others did not find the 6-minute walk test to be a reliable measure of exercise capacity42. Further studies that test the assessment of exercise capacity from wrist-worn heart rate data against cardiopulmonary exercise testing may be needed to support its validity. Furthermore, studies including other patient groups are needed to verify the method's applicability for other diseases. More datasets with a longer recording time could further reduce statistical uncertainty.

In summary, we were able to demonstrate that wrist-worn devices can be of use for predicting a patient's 6-minute walk test outcome, with uncertainties overlapping with the minimal clinically important differences and the MDC of the 6-minute walk test itself. The daily time spent in moderate activity was determined based on heart rate data obtained from these wearable devices. It was predictive of 6-minute walk distances and of the achievement of patient-specific (gender, age and BMI) target 6-minute walk distances, an established diagnostic and prognostic marker. Further studies in larger cohorts and a variety of disease groups are required to improve the method’s accuracy and to investigate whether continuous recordings can provide helpful additional information for diagnostic processes and therapy planning.


Study population

The present study was part of the “EurValve” research initiative, focusing on decision support in patients with valvular heart disease. The project’s aim was to implement and test, in a relevant clinical target cohort, a decision support system (DSS) for aortic and mitral valve replacement and repair. The main component of the DSS was a combined 0D model of the cardiovascular system that includes modification options for valve repair and replacement, aiming to predict the haemodynamic effects of different types of treatment. The multi-centre study was conducted at three European sites in the Netherlands, the United Kingdom and Germany. In all recruited patients, clinical routine data were assessed and included demographics, anthropometrics, medical history and functional cardiac imaging. Patients with aortic stenosis, mitral regurgitation and mixed valvular heart disease recruited at Deutsches Herzzentrum Berlin (DHZB, German Heart Centre Berlin) and the Sheffield Teaching Hospitals were assessed using a Philips health watch (DL8791, Philips, Stamford, CT, USA). The device is licensed as a medical product in Europe. Heart rate as well as steps and activity counts from wrist movement were recorded.

The study protocol included wearing the device during everyday life for at least 23 h and performing a 6-minute walk test. Datasets were excluded if the heart rate-based analysis of daily activity was affected by tachycardic atrial fibrillation or if no 6-minute walk test was conducted. Patients were included at different stages and time points during the treatment process (before and/or after treatment).

The primary comparison was between the amount of time spent at different heart rate-based levels of activity each day and 6-minute walk distances. Secondarily, daily activity was compared to the achievement of target 6-minute walk distances. All procedures followed the ethical guidelines of the 1975 Declaration of Helsinki and were approved by the local ethics committees at both sites (Ethikkommission Charité – Universitätsmedizin Berlin: EA2/093/16, NHS Health Research Authority: 17/LO/0283). Written informed consent was obtained from all included patients. Due to ethical regulations, untreated patients with severe aortic stenosis were not included at German Heart Centre Berlin. The study complied with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement. The study has been registered on ClinicalTrials.gov (NCT04068740).

Six-minute walk test

The 6-minute walk test was performed according to current guidelines16 and took place shortly before the device recordings, before the patient received the device. The patient-specific target distance depending on gender, age and BMI was calculated using the established formula “6-minute walk distance = 1140 m − (5.61 × BMI) − (6.94 × age)” for men and “6-minute walk distance = 1017 m − (6.24 × BMI) − (5.83 × age)” for women43, with BMI in kg/m2 and age in years.
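The two reference equations quoted above can be written as a short Python sketch. The function names, the string encoding of gender and the example values are illustrative assumptions and are not taken from the study software; only the coefficients come from the text.

```python
def target_six_minute_walk_distance(gender: str, bmi: float, age: float) -> float:
    """Patient-specific target 6-minute walk distance in metres.

    Uses the reference equations quoted in the text, with BMI in kg/m2
    and age in years. `gender` is expected to be "male" or "female".
    """
    if gender == "male":
        return 1140.0 - 5.61 * bmi - 6.94 * age
    if gender == "female":
        return 1017.0 - 6.24 * bmi - 5.83 * age
    raise ValueError("gender must be 'male' or 'female'")


def percent_of_target(walked_m: float, gender: str, bmi: float, age: float) -> float:
    """Achieved 6-minute walk distance as a percentage of the target distance."""
    return 100.0 * walked_m / target_six_minute_walk_distance(gender, bmi, age)


if __name__ == "__main__":
    # Hypothetical example patients (values chosen for illustration only).
    print(target_six_minute_walk_distance("male", bmi=26.0, age=65.0))
    print(target_six_minute_walk_distance("female", bmi=28.0, age=68.0))
    print(percent_of_target(500.0, "male", bmi=26.0, age=65.0))
```

A patient is counted as having achieved the target when the percentage is at least 100, which is the binary outcome used in the logistic regression model.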

Heart rate-based measurements of physical activity

The average heart rate as well as the total number of steps and activity counts were computed for each minute. Activity counts are a measure derived from accelerometry data obtained from the wrist-worn device. All data were analysed by an automated software tool to identify time spent at different levels of activity. Sleep and rest were determined based on steps and activity counts during recordings. Rest was defined as one or no steps per minute and an activity count below 475 arbitrary units (AU) per minute; sleep as an activity count below 70 AU.
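The rest and sleep definitions can be sketched as a per-minute classification in Python. This is a minimal illustration rather than the automated software tool used in the study. It assumes that sleep is a subset of rest, i.e., that the step criterion also applies to sleep, which the text does not state explicitly; the function name and labels are hypothetical.

```python
REST_MAX_STEPS = 1            # "one or no steps per minute"
REST_MAX_ACTIVITY_AU = 475    # rest: activity count below 475 AU per minute
SLEEP_MAX_ACTIVITY_AU = 70    # sleep: activity count below 70 AU per minute


def classify_minute(steps: int, activity_count: float) -> str:
    """Label one recorded minute as "sleep", "rest" or "active".

    Assumption: sleep requires the same step criterion as rest, so a minute
    is only labelled "sleep" when both steps and activity count are low.
    """
    low_steps = steps <= REST_MAX_STEPS
    if low_steps and activity_count < SLEEP_MAX_ACTIVITY_AU:
        return "sleep"
    if low_steps and activity_count < REST_MAX_ACTIVITY_AU:
        return "rest"
    return "active"


if __name__ == "__main__":
    # Hypothetical minutes given as (steps, activity count in AU).
    for steps, count in [(0, 50), (1, 300), (12, 300), (0, 900)]:
        print(steps, count, classify_minute(steps, count))
```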

The average heart rate during time spent resting while wearing the device was defined as the resting heart rate. All activity levels were then determined by changes in heart rate. Based on a meta-analysis of physiologic heart rate changes during different levels of activity44 and the identified resting heart rate, the level of activity was determined for each data point. An increase of the heart rate between 31 and 49 b.p.m. above resting heart rate was defined as light, between 50 and 88 b.p.m. as moderate and above 88 b.p.m. as high activity44.
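The heart rate-based classification described above can be sketched in Python as follows. The thresholds are those quoted in the text; everything else is an assumption made for illustration: increases below 31 b.p.m. are labelled "none", fractional increases between 49 and 50 b.p.m. are counted as light, the percentage of time is taken relative to all supplied minutes, and the function names are hypothetical.

```python
from statistics import mean


def resting_heart_rate(rest_minute_heart_rates: list[float]) -> float:
    """Resting heart rate as the mean heart rate over minutes labelled as rest."""
    return mean(rest_minute_heart_rates)


def activity_level(heart_rate: float, resting_hr: float) -> str:
    """Activity level from the heart rate increase above the resting heart rate."""
    increase = heart_rate - resting_hr
    if increase > 88:
        return "high"
    if increase >= 50:
        return "moderate"
    if increase >= 31:
        return "light"
    return "none"  # below the light-activity threshold (assumed label)


def percent_time_at_level(minute_heart_rates: list[float], resting_hr: float, level: str) -> float:
    """Percentage of the supplied minutes spent at the given activity level."""
    if not minute_heart_rates:
        raise ValueError("at least one minute of heart rate data is required")
    matching = sum(1 for hr in minute_heart_rates if activity_level(hr, resting_hr) == level)
    return 100.0 * matching / len(minute_heart_rates)


if __name__ == "__main__":
    # Hypothetical per-minute average heart rates in b.p.m.
    resting = resting_heart_rate([58, 60, 62])
    recording = [70, 95, 115, 120]
    print(resting, percent_time_at_level(recording, resting, "moderate"))
```

The percentage returned for the "moderate" level corresponds to the predictor used in both regression models.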

Statistical analysis

Continuous data are expressed as median and interquartile range (IQR, Q1–Q3) unless stated otherwise. Categorical data are presented as frequencies and percentages (%). Data distribution was tested using Shapiro–Wilk and Shapiro–Francia tests. A logistic regression model with continuous and binary patient-specific covariates (age, presence of aortic valve disease and presence of mitral valve disease, use of beta blockers) was used to evaluate the predictive quality of the percent time spent in moderate activity on binary outcome measures (achievement of target 6-minute walk distances). In addition to patient-specific anthropometric and demographic characteristics, both disease conditions were included as binary covariates to control for possible unobserved variables within the disease groups. The rationale for including these conditions in a single model instead of testing in separate models was the concept of general applicability of the method across different groups of valvular heart disease. Robust regression was used to assess multifactorial effects on 6-minute walk distances with consideration of outliers. Based on the robust regression, R2 values (as a goodness-of-fit measure) and predictive margins with standard deviations were calculated and plotted to visualize the combined effects of age, gender and BMI. Stata version 15.1 was used for statistical analysis. Interpretation of our findings followed the advice by the American Statistical Association45. The MFPIGEN package was used to investigate the interaction between each pair of covariates, and both predictive models were built without interactions.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.