Step detection and energy expenditure at different speeds by three accelerometers in a controlled environment

Physical activity (PA) is one of the most efficient ways to prevent obesity and its associated diseases worldwide. In the USA, less than 10% of the adult population met the PA recommendations when accelerometers were used to assess habitual PA. Accelerometers differ significantly from each other in step recognition and do not reveal raw data. The aim of our study was to compare a novel accelerometer, the Sartorio Xelometer, which provides access to raw data, with the existing accelerometers ActiGraph GT3X+ and activPAL in terms of step detection and energy expenditure estimation accuracy. Fifty-three healthy subjects were divided into 2 cohorts (cohort 1, optimization; cohort 2, validation). They wore the 3 accelerometers and performed an exercise routine at the following speeds: 1.5, 3, 4.5, 9 and 10.5 km/h (6 km/h was added for cohort 2). Data from the optimization cohort were used to optimize the Sartorio step detection algorithm. The steps actually taken were recorded with a video camera, and energy expenditure (EE) was measured. To assess the agreement between video and accelerometer step counts, paired samples t tests and intraclass correlations were used separately for step counts at each speed and for total counts, as well as for EE estimates. At speeds of 1.5, 3, 4.5, 6, 9 and 10.5 km/h, the mean absolute percentage errors (MAPE-%) for the Xelometer were 8.1, 3.5, 4.3, 4.2, 3.1 and 7.8, respectively (after optimization). For the ActiGraph GT3X+, the MAPE-% were 96.93 (87.4), 34.69 (23.1), 2.13 (2.3), 1.96 (2.6) and 2.99 (3.8), respectively, and for the activPAL 6.55 (5.6), 1.59 (0.6), 0.81 (1.1), 10.60 (10.3) and 15.76 (13.8), respectively. Significant intraclass correlations were observed between Xelometer estimates and actual steps at all speeds. The Xelometer estimated EE with a MAPE-% of 30.3, and the activPAL and ActiGraph GT3X+ with MAPE-% of 20.5 and 24.3, respectively. The Xelometer is a valid device for assessing step counts at different gait speeds. MAPE differs between speeds, which is important when assessing PA in obese and elderly subjects. The EE estimates of all three devices were inaccurate when compared with indirect calorimetry.


Results
Step detection. Three accelerometers were used to estimate the number of steps taken, and each was individually compared with the steps counted from the video recordings at the 5 speeds (6 in cohort 2) to determine the accuracy of the different methods (Tables 1, 2). Two cohorts with similar anthropometrics were measured. The data from the optimization cohort (cohort 1) were used to optimize the step detection algorithm, which was then tested in cohort 2. The mean absolute percentage error (MAPE) percentages observed in step detection resulted from underestimation of step counts by all three devices. Bland-Altman plots were constructed to visualize the relationship between the mean and the difference of actual and estimated steps for each device.
Table 1. 35 healthy subjects (optimization cohort 1) participated in the study and performed a 20-min exercise routine with 3 walking (1.5, 3 and 4.5 km/h) and 2 running (9 and 10.5 km/h) speeds. Each speed was recorded for 4 min. The performance was recorded with a video camera, and steps were counted afterwards for each speed. MAPE percentages, paired samples t test statistics and intraclass correlation (ICC) statistics with 95% CI are presented for each device at each speed and for the total sum of steps. *Shows statistical significance.
R² between the actual and device-estimated steps was 0.824 when all the different speeds were considered individually (Table 1, Fig. 1A). The intraclass correlations were significant at speeds between 3 and 9 km/h, with good or excellent correlation coefficients (ICC > 0.75 and > 0.90, respectively). The thigh-worn activPAL performed most accurately at the walking speeds of 1.5, 3 and 4.5 km/h, with MAPE percentages of 6.6, 1.6 and 0.8, respectively (Table 1). At the walking speeds, a significant difference between the observed and estimated step counts was detected at 1.5 km/h but not at 3 or 4.5 km/h (p = 0.001, 0.194 and 0.078, respectively).
While running (9 and 10.5 km/h), the MAPE percentages were 10.6 and 15.8, respectively. Statistically significant differences between actual and device-estimated steps were found at both running speeds. When all speeds were evaluated together, the MAPE percentage was 7.9, with a significant difference from the video-recorded step counts. Regression analysis resulted in an R² of 0.836 when all speeds were considered individually (Fig. 1B). Statistically significant intraclass correlations were observed at all speeds except 9 km/h. The Bland-Altman plot is shown in Supplementary Figure 2.
The ActiGraph GT3X+ performed well at the higher exercise speeds (4.5, 9 and 10.5 km/h), with MAPE percentages of 2.1, 2.9 and 3.0, respectively. In the paired samples t test, no significant difference between video-observed and ActiGraph-estimated steps was found at the running speeds (9 and 10.5 km/h, p = 0.802 and 0.723, respectively). At the walking speeds of 1.5 and 3 km/h, the GT3X+ was less accurate, with MAPE percentages of 96.9 and 34.7, respectively, and a significant difference in step counts was observed between the methods (Table 1). There was a significant difference between the methods in the total steps taken during the 20-min period, with a MAPE percentage of 16.7. Significant intraclass correlations were observed at 4.5, 9 and 10.5 km/h and for total steps taken. The R² for the regression between individual actual and estimated steps was 0.925 (Fig. 1C). The Bland-Altman plot is shown in Supplementary Figure 3.
For running (9 and 10.5 km/h), the MAPE percentages of the EE estimates were 29.4 and 18.4, respectively (Table 3). The R² between IC METs and accelerometer METs was 0.910 (Fig. 2). Significant differences were observed at all speeds between the accelerometer-estimated and IC EE. No significant intraclass correlations were found.

Validation cohort 2.
After the optimization of the Sartorio step detection algorithm, validation cohort 2 was investigated (Table 4). At 1.5 km/h, the MAPE-% was 8.1, with a significant difference in step counts between the methods (Table 5). At the remaining walking speeds (3, 4.5 and 6 km/h), the MAPE-%s were 3.5, 4.3 and 4.2, respectively, and no significant differences between direct observation and estimated step counts were detected. At the running speeds (9 and 10.5 km/h), the MAPE-%s were 3 and 4.8, respectively. The MAPE-% for the total 24-min step count was 2.7. At 10.5 km/h and for total step counts, significant differences were observed between direct observation and estimated steps. Significant intraclass correlations were detected at all speeds and for the total step count, with correlation coefficients between 0.62 and 0.99. The R² for the regression between direct observation and estimated steps was 0.965 (Fig. 1D). The Bland-Altman plot is shown in Supplementary Figure 5.
In the validation cohort, the results for the activPAL and ActiGraph were similar to those in the optimization cohort (Fig. 1E,F). At 6 km/h, the two devices had low MAPE-%s of 1.2 and 3.1, respectively, and no significant differences between estimated and directly observed steps were found (Table 5). The R² values of the regressions between direct observation and estimated steps were 0.88 and 0.86, respectively.

Discussion
We evaluated the novel Xelometer for step detection in a controlled environment at different speeds and compared it with the two most commonly used accelerometers, the ActiGraph GT3X+ and activPAL. Since the step detection of the Sartorio Xelometer was found to be inaccurate at slow walking (1.5 km/h) and running speeds (9 and 10.5 km/h), we decided to optimize the algorithm with machine learning using the data from optimization cohort 1 and to test the optimized algorithm in validation cohort 2. The protocol in optimization cohort 1 included walking (1.5, 3 and 4.5 km/h, or 25, 50 and 75 m/min) and running (9 and 10.5 km/h, or 150 and 175 m/min). In addition, the speed of 6 km/h (100 m/min) was included in the protocol of validation cohort 2. We were especially interested in the slow walking speeds (1.5 and 3 km/h), since these are the most frequent walking speeds in the elderly and obese populations. The optimization of the Sartorio step detection algorithm improved the accuracy at the slow walking speed (1.5 km/h) and while running (9 and 10.5 km/h). The correlation coefficients decreased at speeds of 1.5-4.5 km/h, but the correlations became significant. All three accelerometers had their optimal speed range, and the accuracy varied between devices. After optimization of the step detection algorithm, the Xelometer had the best overall performance in validation cohort 2, with a total count MAPE percentage of 2.7, compared with 14.1 and 5.6 for the ActiGraph and activPAL, respectively. The Xelometer performed most accurately at speeds of 3 and 9 km/h. At walking speeds, the activPAL was the most accurate device, ahead of the Sartorio, while the ActiGraph GT3X+ did not detect most of the steps. At running speeds, the ActiGraph GT3X+ performed the most accurately, ahead of the Sartorio and activPAL.
The intraclass correlations support these findings, showing significant and good (> 0.75) or excellent (> 0.90) correlation coefficients for the activPAL at all speeds except 10.5 km/h, and for the ActiGraph GT3X+ while running and while walking at or above 4.5 km/h. For the optimized Sartorio Xelometer, the correlations were significant at all speeds, with good or excellent correlations for the overall step count and at speeds of 4.5-10.5 km/h. Importantly, the Xelometer's performance was the most stable throughout the protocol, with all MAPE-%s between 3.1 and 4.3 except the 8.1 at 1.5 km/h. The MAPE-%s observed in this study for the activPAL and GT3X+ are similar to those published by Feito and colleagues in 2012, who showed that activPAL error percentages are low at speeds of 2.4-5.6 km/h, while the GT3X+ becomes accurate in detecting steps at speeds higher than 4 km/h (67 m/min) and significantly underestimates step counts at lower speeds 15. At a speed of 3.2 km/h, a 40% error has been reported for the GT3X+, which then diminishes with increasing speed 22. We did not use ActiGraph's low frequency extension, since discrepancies have been reported when applying it to overall step detection in free-living conditions 23,24. Our findings are also in line with those of Ryan and colleagues, who showed accurate step detection for the activPAL at speeds between 3.24 and 6.4 km/h 25. When selecting a suitable method for PA measurement, the properties of the accelerometers should be considered. If the main interest is in studying older or more sedentary populations, a device that is more accurate at the lower end of the locomotion spectrum is needed. Based on these results, the Xelometer's step detection is a valid method for observing a subject's PA volume. In comparison with the two commonly used and validated accelerometers, the ActiGraph GT3X+ and activPAL, the Xelometer performed consistently throughout the protocol (MAPE-% 2.7, ICC 0.99) and is thus especially suitable for observing the PA level.
Table 5. 19 healthy subjects (validation cohort 2) participated in the study and performed a 24-min exercise routine with 4 walking (1.5, 3, 4.5 and 6 km/h) and 2 running (9 and 10.5 km/h) speeds. Each speed was recorded for 4 min. The performance was recorded with a video camera, and steps were counted afterwards for each speed. Mean absolute percentage error (MAPE) percentages, paired samples t test statistics and intraclass correlation (ICC) statistics with 95% CI are presented for each device at each speed and for the total sum of steps. *Shows statistical significance.
All devices performed poorly in estimating EE. The activPAL estimated EE most closely, with a MAPE of 20.5%, while the Sartorio Xelometer and ActiGraph GT3X+ gave less accurate estimates, with MAPE percentages of 30.3 and 24.3, respectively. Standard deviations of 9.6-16.4% support this finding. Moreover, the ICCs showed no significant correlations with the IC measurements for any of the three devices. The Sartorio Xelometer's individual EE estimates for each speed were less accurate at lower speeds and more accurate at higher speeds. Both the activPAL and ActiGraph GT3X+ underestimated EE, while the Sartorio Xelometer overestimated it. Our error percentages are similar to those reported by Calabró and colleagues, with underestimations of 22.2% and 25.5% for the activPAL and ActiGraph GT3X+, respectively 26. Similar error percentages for the ActiGraph GT3X+, but not for the activPAL, were reported (21.2 and 9.3, respectively) by Alberto and colleagues in light-intensity PA 27. These results suggest that accelerometer EE estimates are inaccurate and that their use in research should be carefully considered.
Our study has limitations. Our subjects were healthy volunteers with a BMI below 27 (optimization cohort 1, 23.0 ± 2.5; validation cohort 2, 24.6 ± 3.24). The results observed in this population may not be transferable to a cohort with different anthropometrics and PA capabilities. We also examined the function of the devices only in a controlled environment. The strengths of this study include video-recorded true steps, subjects of both sexes and the use of two already validated accelerometers.

Materials and methods
We recruited 54 healthy subjects among the students and faculty members of the University of Oulu and other City of Oulu institutions (Table 4). The subjects were divided into two cohorts (35 subjects for optimization cohort 1 and 19 subjects for validation cohort 2) based on the order of their sign-up. Validation cohort 2 was recruited after the analysis of optimization cohort 1. The study was approved by the ethical committee of the Northern Ostrobothnia Hospital District and was conducted in line with national legislation, guidelines and the Declaration of Helsinki. Written informed consent was given by the subjects in accordance with the Declaration of Helsinki.
Subjects in the optimization cohort (n = 35) were asked to fast and to refrain from strenuous exercise, coffee and nicotine for at least 14 h before the study visit. Height (cm) and weight (kg) were measured to one decimal place. BMI was calculated as weight (kg) divided by height (m) squared. Body composition was determined by bioimpedance with the InBody 720. Oxygen uptake and carbon dioxide production were recorded by indirect calorimetry (IC) using a Medikro model 909 ergospirometer 28-30. The device was calibrated for volume and gas concentrations before every subject. For gas calibration, a mixture of oxygen (15%), carbon dioxide (5%) and nitrogen (80%) was used. The subjects had fasted overnight. Resting metabolic rate (RMR) was recorded in a supine position until the levels had plateaued for at least 10 min. The last 5 min of the measurement were used to calculate RMR. The respiratory exchange ratio (RER) was monitored to stay between 0.7 and 0.99 during the RMR measurement. Metabolic rate was calculated using the Weir equation: metabolic rate (kcal/day) = 1.44 × (3.94 × VO2 + 1.11 × VCO2). Since RMR was measured in a supine position and not while sitting, a conversion factor of 7% was used to correct the RMR values for the postural effect according to Newton et al. 31. The corrected RMR was used as the level of 1 metabolic equivalent (MET) in further IC analysis to calculate the reference values for the accelerometer estimates.
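The reference calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that VO2 and VCO2 are expressed in ml/min and that the 7% postural correction scales the supine RMR upward, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
# Sketch of the Weir-based RMR and MET reference calculation.
# Assumptions (not stated in the text): gas exchange is in ml/min, and the
# 7% postural correction increases the supine RMR.

def weir_kcal_per_day(vo2_ml_min, vco2_ml_min):
    """Weir equation as quoted in the text: 1.44 * (3.94*VO2 + 1.11*VCO2)."""
    return 1.44 * (3.94 * vo2_ml_min + 1.11 * vco2_ml_min)


def posture_corrected_rmr(rmr_supine_kcal_day, correction=0.07):
    """Apply the postural correction to a supine RMR (direction assumed)."""
    return rmr_supine_kcal_day * (1.0 + correction)


def mets(metabolic_rate_kcal_day, corrected_rmr_kcal_day):
    """Express a metabolic rate as a multiple of the corrected RMR (1 MET)."""
    return metabolic_rate_kcal_day / corrected_rmr_kcal_day


if __name__ == "__main__":
    # Made-up gas exchange values, for illustration only.
    rmr = posture_corrected_rmr(weir_kcal_per_day(250.0, 200.0))
    walking = weir_kcal_per_day(1000.0, 850.0)
    print(round(mets(walking, rmr), 2))
```

Using an individually measured RMR as 1 MET, as the text describes, ties the IC reference to each subject and avoids relying on a population constant.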
After the baseline measurements, the subjects were asked to perform a 20-min exercise routine on a treadmill (OJK 2, Telineyhtymä, Kotka, Finland), which consisted of five 4-min walking and running periods. The speeds were 1.5, 3, 4.5, 9 and 10.5 km/h, in that order. The lower speeds were chosen to correspond to the walking speeds in the daily behavior of older and diseased subjects. People over 60 years of age move at a speed of 4.2 km/h on average, and prediabetic subjects have been shown to move mainly at speeds of 2-3 km/h 10,32. The acceleration to the next speed took 5-10 s and was included at the beginning of each 4-min period. A video camera was set up to record the subject's moving feet during the exercise. These videos were used to count the actual steps. Two persons counted the steps from the video 33. This method of direct observation was chosen to reduce observational error. Oxygen consumption and carbon dioxide production were monitored throughout the exercise protocol to calculate the PA energy expenditure. The total EE for the 20-min protocol was calculated by determining the EE with the Weir equation for every minute and summing the values.
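The per-minute summation of EE described above could look like the following sketch. It assumes one averaged (VO2, VCO2) pair per minute in ml/min, and it converts the kcal/day form of the Weir equation to kcal/min by dividing by 1440. The data layout and function names are illustrative assumptions.

```python
# Sketch of the total-EE calculation: Weir equation per minute, then summed.
# Assumption: one (VO2, VCO2) pair in ml/min is available for each minute.

MINUTES_PER_DAY = 1440


def weir_kcal_per_min(vo2_ml_min, vco2_ml_min):
    """Weir equation (kcal/day form from the text) rescaled to kcal/min."""
    kcal_per_day = 1.44 * (3.94 * vo2_ml_min + 1.11 * vco2_ml_min)
    return kcal_per_day / MINUTES_PER_DAY


def total_ee_kcal(minute_samples):
    """Sum the per-minute EE over the whole protocol."""
    return sum(weir_kcal_per_min(vo2, vco2) for vo2, vco2 in minute_samples)


if __name__ == "__main__":
    # Made-up constant gas exchange over a 20-min protocol.
    protocol = [(1000.0, 900.0)] * 20
    print(round(total_ee_kcal(protocol), 2))
```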
The subjects wore three different accelerometers during the exercise. The ActiGraph GT3X+ and Sartorio Xelometer were worn on an elastic belt on the right hip. The Sartorio Xelometer (Supplementary Figure 4) is a novel tri-axial accelerometer with a raw acceleration data output (g), a ± 8 g range, 0.0156 g resolution, a 100 Hz sampling rate and a battery life of 21 consecutive days of measurement. No data processing takes place within the device; all data processing is done in MATLAB R2019a software. The activPAL was worn on the left thigh. The manufacturer's software was used to set up the device and to determine step counts and EE estimates. For the EE estimation, the activPAL uses the following equation: MET·h = (1.4 × d) + (4 − 1.4) × (c/120) × d, where c = cadence (steps per minute) and d = activity duration (in hours) 34. ActiGraph GT3X+ data were extracted with ActiLife v6.13.4, and step counts were determined using 1 s epochs and a 100 Hz sampling rate to accurately define the counts for the different speeds. For energy expenditure (METs), the Freedson Adult (1998) cut points were used within the software (equation: MET rate = 1.439008 + (0.000795 × CPM), where CPM = counts per minute) 35. MET-hours from the activPAL were transformed into METs for comparisons. Only the total EE for the recordings is available from ActiGraph's ActiLife and activPAL's software, which prevents comparison with the IC at the 4-min interval level. Sartorio Xelometer data were extracted using Sartorio v18 software, and detection algorithms provided by the manufacturer were run in MATLAB R2019a for step counts, step intensities and EE estimates (MET). The step detection algorithm creates a 3D vector based on the recordings from the three axes (Fig. 3). When the 3D vector signal exceeds a certain threshold, the exceeding area is analyzed based on its peak amplitude and its rise and decline times.
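As a worked illustration of the two manufacturer EE equations quoted in this paragraph, the sketch below evaluates them directly. It takes the stepping coefficient in the activPAL equation to be (4 − 1.4), which is an assumption about the intended term, and it converts MET-hours to mean METs by dividing by the duration. The function names are hypothetical and are not part of either vendor's software.

```python
# Sketch of the activPAL and Freedson (1998) EE equations quoted in the text.
# Assumption: the activPAL stepping coefficient is (4 - 1.4).

def activpal_met_hours(cadence_spm, duration_h):
    """MET-hours for a stepping bout: 1.4*d + (4 - 1.4)*(c/120)*d."""
    return 1.4 * duration_h + (4.0 - 1.4) * (cadence_spm / 120.0) * duration_h


def met_hours_to_mets(met_hours, duration_h):
    """Mean MET level over the bout, used to compare devices in METs."""
    return met_hours / duration_h


def freedson_met_rate(counts_per_minute):
    """Freedson Adult (1998): MET rate = 1.439008 + 0.000795 * CPM."""
    return 1.439008 + 0.000795 * counts_per_minute


if __name__ == "__main__":
    # A 4-min bout at an illustrative cadence of 120 steps per minute.
    bout_h = 4.0 / 60.0
    print(met_hours_to_mets(activpal_met_hours(120.0, bout_h), bout_h))
    print(freedson_met_rate(2000.0))
```

Under this reading, the activPAL estimate depends only on cadence and duration, whereas the Freedson estimate depends only on activity counts.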
The presented data have not been used in the development of the Sartorio step detection algorithm 1 or EE estimation algorithm or the software. The EE estimation algorithm for the Xelometer is based on the signal vector and the results of Vähä-Ypyä et al. 36, following the equation: VO2 = 7.920-0.0331*MAD (mg). MAD (mean amplitude deviation) was calculated with the following equation: $\mathrm{MAD} = \frac{1}{N}\sum_{i=j}^{j+N-1}\left|R_i - R_{ave}\right|$, where N is the number of samples in the epoch, j is the start of the epoch, $R_i$ is the i-th resultant sample and $R_{ave}$ is the mean resultant value. The conversion of VO2 from the Sartorio EE estimation equation to METs was done using the standard conversion factor (1 MET = 3.5 ml·kg⁻¹·min⁻¹).
Data from optimization cohort 1 were used to optimize the Sartorio Xelometer step detection algorithm. The step detection program was developed using a machine learning algorithm. The algorithm applied supervised learning and used the optimization cohort 1 data as the training data. The data included exact step counts obtained by video analysis at four different walking/running speeds and a recorded 3D acceleration signal. These speeds were 1.5, 3, 9 and 10.5 km/h. The algorithm was developed using MathWorks MATLAB R2019a. The machine learning algorithm applied the following parameters related to the norm of the 3D acceleration signal: (I) a threshold value for the 3D acceleration (the acceleration value needs to be higher than the threshold to be accepted as the starting point of an acceleration peak to be analyzed); (II) the maximum value of the 3D acceleration peak; (III) the slope of the 3D acceleration peak; (IV) the area of the 3D acceleration peak; and (V) the time difference between consecutive 3D acceleration peaks. The algorithm tested different threshold values for the given parameters. If the accelerations behaved in the predetermined ways with respect to the tested parameters (for example, an acceleration value higher than the corresponding threshold, a time difference shorter than the corresponding threshold, etc.), they were accepted as step counts, and the relative error compared with the results obtained by video analysis was calculated. The parameter values that yielded the lowest relative errors were chosen for the final step count program (Supplementary Table 4).
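The threshold search described above can be illustrated with a heavily simplified sketch. It is not the manufacturer's algorithm. It uses only two of the five listed parameters, a start threshold (I) and a minimum peak maximum (II), and replaces the unspecified learning procedure with an exhaustive grid search that minimizes the mean relative error against reference counts. All names, parameter grids and the data layout are assumptions made for illustration.

```python
# Simplified sketch of a threshold-based step counter and a grid search over
# its parameters. Only parameters (I) and (II) from the text are modelled.
import math
from itertools import product


def vector_norm(samples):
    """Norm of each tri-axial (x, y, z) sample, in g."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def count_steps(norm, threshold_g, min_peak_g):
    """Count regions above threshold_g whose maximum reaches min_peak_g."""
    steps = 0
    i, n = 0, len(norm)
    while i < n:
        if norm[i] > threshold_g:
            start = i
            while i < n and norm[i] > threshold_g:
                i += 1  # walk to the end of the region above the threshold
            if max(norm[start:i]) >= min_peak_g:
                steps += 1
        else:
            i += 1
    return steps


def fit_parameters(recordings, thresholds, min_peaks):
    """Return (error, threshold, min_peak) with the lowest mean relative error.

    recordings: list of (norm_signal, reference_step_count) pairs, where the
    reference count plays the role of the video-counted steps.
    """
    best = None
    for thr, peak in product(thresholds, min_peaks):
        errors = [abs(count_steps(norm, thr, peak) - ref) / ref
                  for norm, ref in recordings]
        mean_error = sum(errors) / len(errors)
        if best is None or mean_error < best[0]:
            best = (mean_error, thr, peak)
    return best
```

A fuller version would add the slope, area and inter-peak time criteria as further acceptance tests inside `count_steps` and include them in the searched grid.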
Subjects in validation cohort 2 (n = 19) completed a similar set of measurements, with the following exceptions. RMR and energy consumption were not measured, since the optimization of the algorithm does not affect the EE estimation. The brisk walking speed of 6 km/h was added to the protocol.
MAPE percentages were calculated at all speeds between the actual (video) and accelerometer-estimated step counts with the following equation: $\mathrm{MAPE\text{-}\%} = \frac{100}{n}\sum_{k=1}^{n}\frac{\left|S_k^{est} - S_k^{video}\right|}{S_k^{video}}$, where $S_k^{est}$ and $S_k^{video}$ are the accelerometer-estimated and video-counted steps of subject k and n is the number of subjects. MAPE percentages of over 5 were considered a relevant disagreement 15,37. For EE estimates, only the total EE for the 20 min was used in the assessment. Data from the accelerometers and IC were transformed into METs for comparisons. To assess the agreement between video and accelerometer step counts, paired samples t tests, linear regression and intraclass correlations were calculated, and Bland-Altman plots were drawn for step counts at the different speeds. All statistical analyses were performed and figures generated using IBM SPSS Statistics version 26. P-values less than 0.05 were considered statistically significant. An ICC over 0.90 was considered excellent, 0.75-0.90 good, 0.60-0.75 moderate and less than 0.60 low. Results in Tables 4 and 5 are presented as mean ± standard deviation.
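The agreement measures described above can be sketched as follows. The sketch assumes that MAPE is computed per subject and then averaged, which is the usual definition and is consistent with the standard deviations reported alongside the MAPE values. The handling of the exact boundary values in the ICC categories is also an assumption, since the text gives overlapping range limits.

```python
# Sketch of the MAPE calculation and the interpretation thresholds from the
# text. Boundary handling at exactly 0.90, 0.75 and 0.60 is an assumption.

def mape_percent(actual, estimated):
    """Mean absolute percentage error between paired step counts."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(est - act) / act for act, est in zip(actual, estimated)]
    return 100.0 * sum(errors) / len(errors)


def is_relevant_disagreement(mape):
    """MAPE percentages over 5 were considered relevant disagreement."""
    return mape > 5.0


def icc_category(icc):
    """Map an ICC value to the qualitative categories used in the text."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.60:
        return "moderate"
    return "low"


if __name__ == "__main__":
    video = [100, 200]    # illustrative video-counted steps per subject
    device = [90, 210]    # illustrative device-estimated steps
    value = mape_percent(video, device)
    print(round(value, 1), is_relevant_disagreement(value))
```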

Conclusions
The Xelometer is a valid device for assessing step counts at different gait speeds. Its accuracy is comparable to that of the widely used activPAL and differs from that of the ActiGraph GT3X+. The EE estimates of all three devices were inaccurate when compared with indirect calorimetry.