Introduction

Large epidemiologic studies have established the health-promoting effects of self-reported physical activity on numerous disease endpoints, including cardiovascular disease, hypertension, type 2 diabetes, various cancers, and premature mortality1. However, self-report physical activity methods are fraught with numerous limitations, including susceptibility to systematic errors like social desirability reporting and recall issues, which can lead to potentially biased estimates2,3. By comparison, device-based measurement methods such as accelerometers provide objective estimates of physical activity4. Moreover, advancements in technology have made it possible to record and store triaxial raw 24-h accelerometry data over extended periods, spanning days or weeks, in large-scale studies involving thousands of participants (e.g., UK Biobank, NHANES, the HUNT4-N Study, the Pelotas Birth Cohorts)5,6,7,8. The major advantage of high-resolution accelerometry data lies in its capacity for transparent and flexible processing, which facilitates comparability and improves harmonization across different studies and devices9,10,11,12. Furthermore, data from wearable sensors enable the use of sophisticated analytical methods, such as deriving physical activity patterns and conducting compositional 24-h activity analyses13,14,15,16,17. The initial and most critical step in effectively utilizing raw accelerometry data involves rigorous quality control and the thorough generation of derived physical activity metrics.

The primary aim of the current manuscript was to describe the methodology for processing raw accelerometry data used in the German National Cohort (NAKO). Our goal was to evaluate the completeness and plausibility of the data and to provide justification for key data processing and analysis decisions. The ultimate goal was to produce and share a comprehensive repository of accelerometry data, which has the potential to address unresolved questions in the field of physical activity epidemiology.

Methods

Study population

Detailed information on the design and aims of the NAKO has been published elsewhere18. In short, 205,415 men and women (50% each) aged 19–74 years were recruited in 18 study regions in Germany at study baseline between 2014 and 201918. The study aims to identify risk and protective factors as well as to provide imaging and biomarkers for major chronic diseases18. The NAKO is performed in accordance with the ethical standards of the institutional and/or national research committee, with national law and with the Declaration of Helsinki of 1975 (in the current, revised version). The study was approved by the responsible local ethics committees of the German Federal States where all study centers were located (Bayerische Landesärztekammer (protocol code 13023, Approval Date: 27 March 2013 and 14 February 2014 (rectification of documents, study protocol, consent form etc.))). Written informed consent was obtained from all participants18,19. This study is reported according to the STROBE (Strengthening the Reporting of Observational studies in Epidemiology) Guidelines/methodology (Supplementary Information 2).

Accelerometer and data collection

ActiGraph (Pensacola, FL, USA) accelerometers have been extensively employed in epidemiologic research20. Because ActiGraph products evolved over the course of the NAKO baseline data collection, three ActiGraph models (GT3X+, wGT3X+, and wGT3X-BT) were employed. All models record acceleration in units of g (gravitational acceleration, ≈ 9.81 m/s²) tri-axially (in longitudinal, lateral, and anteroposterior direction when positioned on the side of the hip) and were configured at a 100 Hz sampling rate. While the GT3X+ and wGT3X+ have a dynamic range of ± 6 g, the wGT3X-BT has a dynamic range of ± 8 g (all models use a 12-bit conversion). Further, firmware versions ranged from v1.2.0 to v3.2.1. To maximize the transparency and reproducibility of data processing, we disabled the default Low-Power-Mode filter to avoid relying on manufacturer-dependent pre-processing steps of the data.

During the study center visit, study personnel attached the device above the right hip on the mid-axillary line using an elastic strap wearable over or underneath clothing. Participants were instructed to wear the device continuously (24/7) and to carry out all activities as usual. The device had to be taken off only in case of water contact lasting longer than 30 min such as in the sauna or while swimming or diving. The recording period started on the first day and ended on the eighth day after the study center visit. On the morning of day nine, the device had to be detached and sent back to the study center using a pre-paid envelope.

Data processing

We used the open-source R package GGIR version 2.10-321,22 combined with the R package read.gt3x version 1.2.023 for data processing. GGIR has been described elsewhere in detail21. All data processing was conducted on the high-performance computing cluster at the University of Regensburg. In the following section we summarize the main processing steps.

Briefly, calibration correction coefficients were derived from non-movement periods in the acceleration data, where an iterative change point algorithm was used to calibrate signals to 1 g24. Measurements with a calibration error greater than 0.02 g were excluded from analysis based on our observation that the distribution of acceleration metrics values, as detailed below, showed greater variation with > 0.02 g calibration error compared with 0.01–0.02 g calibration error.
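The exclusion rule above can be illustrated with a minimal sketch. It assumes that the calibration error is summarized as the mean absolute deviation of the vector magnitude from 1 g over non-movement samples; this summary and the function names `calibration_error` and `passes_calibration` are illustrative assumptions, not the GGIR implementation.

```python
import math

MAX_CALIBRATION_ERROR_G = 0.02  # exclusion threshold stated in the text


def calibration_error(static_samples):
    """Mean absolute deviation of the vector magnitude from 1 g (in g).

    static_samples: iterable of (x, y, z) tuples in g, taken from
    non-movement periods (assumed to be identified beforehand).
    """
    norms = [math.sqrt(x * x + y * y + z * z) for x, y, z in static_samples]
    return sum(abs(n - 1.0) for n in norms) / len(norms)


def passes_calibration(static_samples, max_error_g=MAX_CALIBRATION_ERROR_G):
    """True if the measurement would be kept under the 0.02 g rule."""
    return calibration_error(static_samples) <= max_error_g
```

Under this sketch, a recording whose static samples deviate from 1 g by 0.05 g on average would be excluded, whereas one deviating by 0.01 g would be retained.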

To empirically verify variation in the X and Y axis order, the longitudinal axis was estimated from the data by calculating the correlation of the epoch-by-epoch angle of each accelerometer axis with a 24-h lagged version of itself. See legend of Fig. S5 for details on angle calculation. As the longitudinal axis shows a clear upright (daytime)–lying (nighttime) pattern, it is expected to show the highest day-by-day correlation.
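The lagged-correlation idea can be sketched as follows. The sketch assumes that epoch-level angles per axis are already available (the angle calculation itself is described in the legend of Fig. S5 and is not reproduced here); the function names and the dictionary layout are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no day-by-day pattern
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(series, lag):
    """Correlation of a series with a copy of itself shifted by `lag` epochs."""
    return pearson(series[:-lag], series[lag:])


def estimate_longitudinal_axis(angles_by_axis, epochs_per_day):
    """Return the axis whose angle repeats most strongly from day to day.

    angles_by_axis: dict such as {"x": [...], "y": [...], "z": [...]}
    epochs_per_day: number of epochs in 24 h (17280 for 5-s epochs)
    """
    scores = {
        axis: lagged_correlation(values, epochs_per_day)
        for axis, values in angles_by_axis.items()
    }
    return max(scores, key=scores.get), scores
```

An axis that alternates between an upright daytime angle and a lying nighttime angle on every day yields a 24-h lagged correlation close to 1, so it is selected over axes without such a daily rhythm.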

Non-wear time was detected using a previously described and commonly used procedure21,25,26,27,28. Briefly, non-wear times per 15-min interval were identified using the standard deviation (< 13.0 mg for ActiGraph) and the range of values (< 50.0 mg) of the enclosing 60-min interval25.
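A simplified sketch of this windowed classification is given below. The thresholds (13.0 mg, 50.0 mg) and the 15-min block within a 60-min window come from the text; requiring both criteria on at least two of the three axes, using the population standard deviation, and clipping the window at the recording edges are assumptions made for illustration and may differ from the cited procedure.

```python
import statistics

SD_LIMIT_G = 0.013     # 13.0 mg
RANGE_LIMIT_G = 0.050  # 50.0 mg


def axis_is_still(values):
    """Assumed per-axis rule: low standard deviation and small value range."""
    return (statistics.pstdev(values) < SD_LIMIT_G
            and (max(values) - min(values)) < RANGE_LIMIT_G)


def detect_nonwear(samples, samples_per_block):
    """Flag each 15-min block as non-wear (True) or wear (False).

    samples: list of (x, y, z) tuples in g
    samples_per_block: samples in one 15-min block (90000 at 100 Hz)
    The enclosing 60-min window is centered on the block and clipped
    at the start and end of the recording.
    """
    flags = []
    for block in range(len(samples) // samples_per_block):
        center = block * samples_per_block + samples_per_block // 2
        start = max(0, center - 2 * samples_per_block)
        stop = min(len(samples), center + 2 * samples_per_block)
        window = samples[start:stop]
        still_axes = sum(
            axis_is_still([sample[k] for sample in window]) for k in range(3)
        )
        flags.append(still_axes >= 2)  # assumption: two of three axes
    return flags
```

Because the decision for a block depends on the surrounding hour, a motionless block adjacent to movement is classified as wear, which guards against labelling brief still periods as non-wear.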

Various summary metrics were calculated per 5 s epoch, including the Euclidean norm minus one (ENMO)25 and the Mean Amplitude Deviation (MAD)29. Both metrics have previously been used in physical activity research5,26,27,28,30,31 and a detailed description can be found in Supplementary Methods S1 and S2. In contrast to some other studies32, no low-pass frequency filter was applied because high signal frequencies can contain harmonics of movements at lower frequencies, which is not noise but a true reflection of movement. ENMO and MAD values were aggregated separately per participant across all valid (week(end))-days, per valid day, and on a 15-min level across all valid days. Signal features were imputed for 15-min time segments classified as non-wear time or where more than 80% of the raw data points in the segment had a value close to or at the edges of the accelerometer’s dynamic range, as discussed in more detail in a previous study25. Note that not imputing implies zero movement and omitting the data points implies imputing using the rest of the recording.
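For orientation, both metrics can be written in a few lines. The sketch below follows the general definitions (ENMO: epoch mean of the vector magnitude minus 1 g, with negative values set to zero; MAD: epoch mean of the absolute deviation of the vector magnitude from its epoch mean). Function names are illustrative, and the details in Supplementary Methods S1 and S2 take precedence.

```python
import math


def vector_magnitudes(samples):
    """Euclidean norm of each (x, y, z) sample, in g."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


def enmo_mg(samples):
    """Euclidean norm minus one, negatives set to zero, epoch mean in mg."""
    r = vector_magnitudes(samples)
    return 1000 * sum(max(v - 1.0, 0.0) for v in r) / len(r)


def mad_mg(samples):
    """Mean amplitude deviation around the epoch mean, in mg."""
    r = vector_magnitudes(samples)
    mean_r = sum(r) / len(r)
    return 1000 * sum(abs(v - mean_r) for v in r) / len(r)


def epoch_metrics(samples, sample_rate_hz=100, epoch_s=5):
    """Return a list of (ENMO, MAD) pairs, one per complete epoch."""
    n = sample_rate_hz * epoch_s
    return [
        (enmo_mg(samples[i:i + n]), mad_mg(samples[i:i + n]))
        for i in range(0, len(samples) - n + 1, n)
    ]
```

For a signal that oscillates symmetrically between 0.8 g and 1.2 g, this gives an ENMO of 100 mg and a MAD of 200 mg: ENMO discards the samples below 1 g, whereas MAD counts deviations in both directions.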

To represent the distribution of a participant’s time spent in physical activity intensities, value distributions in 10 mg increments for both the ENMO and MAD metrics were derived. It must be noted that MAD values are known to be higher than ENMO values; therefore, time spent in certain acceleration ranges is not directly comparable between ENMO and MAD metrics. The definition of moderate to vigorous-intensity physical activity (MVPA), i.e., physical activity conducted at an intensity of ≥ 3 metabolic equivalents of task (METs)33, does not lend itself to unambiguous estimation from accelerometry data. However, MVPA is frequently used in physical activity research. Thus, to represent time spent above different physical activity intensity (acceleration) levels, various estimates were derived based on different epoch lengths, acceleration thresholds, and bout duration criteria to provide a variety of choices for data analysis and sensitivity analyses (Supplement Box 1).
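The 10 mg value distribution can be sketched as a simple binning of epoch-level values. The half-open bin convention [lower, upper) and the function name `minutes_per_bin` are assumptions for illustration.

```python
from collections import Counter


def minutes_per_bin(epoch_values_mg, epoch_s=5, bin_width_mg=10):
    """Minutes spent in each acceleration bin.

    epoch_values_mg: non-negative ENMO or MAD values, one per epoch
    Returns a dict keyed by (lower, upper) bin edges in mg; bins are
    assumed to be half-open, i.e. lower <= value < upper.
    """
    counts = Counter(int(v // bin_width_mg) for v in epoch_values_mg)
    return {
        (b * bin_width_mg, (b + 1) * bin_width_mg): n * epoch_s / 60
        for b, n in sorted(counts.items())
    }
```

Applied separately to the ENMO and MAD series of one participant, this produces two distributions that should be read on their own scales, since identical bin edges correspond to different movement intensities for the two metrics.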

In line with literature that has relied on 24-h wear protocols34,35, we considered days with at least 16 valid wear hours as valid. Furthermore, using 16 h (2/3 of the anticipated 24-h wear period) aligns with traditional hip-worn accelerometer literature, where accelerometers were typically worn only during waking hours, with an expectation of 10 h of wear out of the approximately 15 waking hours3.

Descriptive and exploratory analyses

We excluded participants with fewer than one valid weekend day or fewer than two valid weekdays, to ensure coverage of both weekend and weekday activities, as well as participants with an incomplete 24-h cycle, i.e., participants with consistently invalid data for any 15-min period across all recording days. To support that rationale, we ran missing data simulations in a subsample of 51,998 participants with perfect wear time compliance (seven valid days, five valid weekdays, two valid weekend days). Subsequently, ENMO measures from six down to one randomly selected day(s) of this sample were averaged and compared against the 7-day ENMO average using intraclass correlation coefficients (ICC). The same procedure was applied to the five-weekday ENMO average and the weekend ENMO average as well as the MAD measures.
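The inclusion rule at the start of this subsection, together with the 16-h valid-day criterion, can be expressed as a short check. The data layout (wear hours keyed by calendar date, plus 96 flags indicating whether each 15-min slot of the day has valid data on at least one recording day) is a hypothetical representation chosen for illustration.

```python
from datetime import date

MIN_WEAR_HOURS = 16     # valid-day criterion from the text
SLOTS_PER_DAY = 24 * 4  # 15-min slots in a 24-h cycle


def is_valid_day(wear_hours):
    """A day counts as valid with at least 16 valid wear hours."""
    return wear_hours >= MIN_WEAR_HOURS


def include_participant(wear_hours_by_date, slot_has_valid_data):
    """Apply the weekday, weekend, and 24-h cycle criteria.

    wear_hours_by_date: dict mapping datetime.date to valid wear hours
    slot_has_valid_data: 96 booleans, True if that 15-min slot has
        valid data on at least one recording day
    """
    valid_days = [d for d, h in wear_hours_by_date.items() if is_valid_day(h)]
    weekend_days = sum(1 for d in valid_days if d.weekday() >= 5)
    weekdays = len(valid_days) - weekend_days
    complete_cycle = (len(slot_has_valid_data) == SLOTS_PER_DAY
                      and all(slot_has_valid_data))
    return weekend_days >= 1 and weekdays >= 2 and complete_cycle
```

A participant with two valid weekdays and one valid Saturday is retained only if no 15-min slot is invalid on every recording day.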

ENMO and MAD values were winsorized at the age- and sex-specific 99.9th percentile. Age was categorized according to 10-year increments. Due to a potential gap of several years between recruitment into the study and the baseline examination, the final sample contains participants older than 70 years. Biological sex was categorized as woman or man. Body height and weight were measured using a Stadiometer 274 and a medical Body Composition Analyzer 515 (seca GmbH & Co. KG, Hamburg), respectively, and BMI was classified according to WHO categories: < 18.5 kg/m2 as underweight, 18.5–24.9 kg/m2 as normal weight, 25.0–29.9 kg/m2 as overweight, and ≥ 30.0 kg/m2 as obese36,37. Times of day were grouped as follows: 0:00–5:59 AM, 6:00–11:59 AM, 12:00–5:59 PM and 6:00–11:59 PM. Of note, days with time shift (due to daylight saving time) were also included, as GGIR takes this into account. The first day of accelerometer wear was used to determine the month, and the season was categorized such that spring started on 1st March. To describe the distribution and characteristics of the data, we computed accelerometer wear time and average acceleration according to population subgroups defined by age, sex, and body mass index (BMI). We also explored temporal and seasonal variation in physical activity. Descriptive and exploratory statistics were calculated using R version 4.3.138.
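Group-specific winsorization can be sketched as follows. The sketch assumes that only the upper tail is capped and that the percentile is obtained by linear interpolation between order statistics; the record keys (`age_group`, `sex`) and function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation (an assumption)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def winsorize_upper(values, q=99.9):
    """Cap values above the q-th percentile at that percentile."""
    cap = percentile(values, q)
    return [min(v, cap) for v in values]


def winsorize_by_group(records, metric, q=99.9):
    """Winsorize `metric` within each (age_group, sex) stratum.

    records: list of dicts with keys "age_group", "sex", and `metric`
    Returns new dicts; the input records are left unchanged.
    """
    groups = {}
    for rec in records:
        groups.setdefault((rec["age_group"], rec["sex"]), []).append(rec[metric])
    caps = {key: percentile(vals, q) for key, vals in groups.items()}
    return [
        {**rec, metric: min(rec[metric], caps[(rec["age_group"], rec["sex"])])}
        for rec in records
    ]
```

Capping within strata rather than across the full sample avoids trimming plausible high values in younger groups merely because older groups have lower overall acceleration.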

Results

Participant flow

We received 73,334 gt3x files, of which 471 files could not be processed due to uninformative data (e.g., file size too small). Data from a further 1694 participants were excluded due to issues that had occurred at the study centers (e.g., incorrect documentation of consent status, improper device initialization) or problems with data quality (e.g., calibration error or clipping scores exceeding threshold values). Of the resulting 71,169 individuals, we disregarded those with fewer than one valid weekend day, those with fewer than two valid weekdays, and those with an incomplete 24-h cycle, resulting in a final population for analysis of 63,236 participants (Fig. 1).
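The participant flow can be checked with simple arithmetic. The snippet below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
# Counts reported in the participant flow
files_received = 73334
unprocessable_files = 471
excluded_center_or_quality = 1694
final_sample = 63236

# Participants remaining after file-level and quality exclusions
after_quality_control = (files_received
                         - unprocessable_files
                         - excluded_center_or_quality)

# Participants removed by the valid-day and 24-h cycle criteria
excluded_wear_criteria = after_quality_control - final_sample
share_excluded = excluded_wear_criteria / after_quality_control

print(after_quality_control, excluded_wear_criteria, round(100 * share_excluded, 1))
```

The intermediate count reproduces the 71,169 individuals stated above, and the difference to the final sample corresponds to 7933 participants, or about 11% of those remaining after quality control.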

Figure 1

Flow chart of participants. PA physical activity.

Accelerometry wear time

Missing data simulation showed that the ICC for two valid weekdays exceeded 0.9, whether compared to the average of a 7-day period or just the 5 weekdays. Likewise, the ICC for either Saturday or Sunday surpassed 0.9 when compared to the average weekend measurement (Fig. S1). Over 90% of participants had at least four valid wear days, and compliance increased with age (Fig. S2). Median wear time was consistent across weekdays, seasons, age, BMI groups, ENMO levels, and between sexes (Table S1).

Baseline characteristics and acceleration summary metrics

In the study, 52% of participants were women. Participants were on average 50.1 years (SD = 12.6) old and had a mean BMI of 26.4 kg/m2 (SD = 4.8). The average ENMO was 11.7 mg (SD = 3.7), with men showing slightly higher values (12.0 mg (SD = 4.0)) than women (11.5 mg (SD = 3.5)). The overall average MAD was 19.9 mg (SD = 6.1), with men at 20.4 mg (SD = 6.5) and women at 19.5 mg (SD = 5.7). The correlation between the winsorized (99.9th percentile) ENMO and MAD was high (r = 0.96, Supplementary Fig. S3).

For both ENMO and MAD, acceleration decreased with increasing age, was higher in men than in women, and was higher in normal-weight than in underweight, overweight, or obese participants (Table 1, Fig. 2).

Table 1 Magnitude of acceleration (ENMO and MAD) by participant characteristics.
Figure 2

Magnitude of acceleration (ENMO and MAD) by age and sex. ENMO Euclidean Norm Minus One with negative values set to zero, MAD mean amplitude deviation, see text, mg milli gravitational acceleration. n = 63,236. ENMO and MAD values were winsorized at age- and sex-specific 99.9th percentile. Interpretation of box and whiskers plot: The box depicts the interquartile range (IQR, central 50% of the distribution) with the 25% quantile and the 75% quantile as lower and upper limits, respectively, as well as the median (50% quantile, middle line); the lower whisker shows the smallest observation that is greater than or equal to the 25% quantile − 1.5 × IQR; the upper whisker depicts the largest observation that is less than or equal to the 75% quantile + 1.5 × IQR; the dots indicate outliers beyond the whiskers.

Acceleration levels were higher on weekdays compared to weekend days (Table 1). Notably, physical activity levels were distinctly lower on Sundays for both men and women across all age groups (Fig. 3). Additionally, both ENMO and MAD values peaked in the summer and were lowest in the winter, affecting both men and women in most age groups (Table 1, Fig. S4). ENMO and MAD values were highest between 12:00 PM and 05:59 PM, and lowest between 0:00 AM and 5:59 AM (Table 1). Younger participants were physically more active in the evening, whereas older subjects were physically more active in the morning (Fig. 4). The variation in the angle of the accelerometer’s longitudinal axis, indicative of its orientation in three-dimensional space, showed a clear difference between upright posture during daytime hours and a reclined posture during nighttime (Fig. S5).

Figure 3

Weekday variation in magnitude of acceleration (A, ENMO and B, MAD) by age, and sex. ENMO Euclidean Norm Minus One with negative values set to zero, MAD mean amplitude deviation, see text, mg milli gravitational acceleration. n = 63,236. ENMO and MAD values were winsorized at age- and sex-specific 99.9th percentile. Interpretation of box and whiskers plot: The box depicts the interquartile range (IQR, central 50% of the distribution) with the 25% quantile and the 75% quantile as lower and upper limits, respectively, as well as the median (50% quantile, middle line); the lower whisker shows the smallest observation that is greater than or equal to the 25% quantile − 1.5 × IQR; the upper whisker depicts the largest observation that is less than or equal to the 75% quantile + 1.5 × IQR; the dots indicate outliers beyond the whiskers.

Figure 4

Daytime variations in magnitude of acceleration (A, ENMO and B, MAD) by age and sex. ENMO Euclidean Norm Minus One with negative values set to zero, MAD mean amplitude deviation, see text, mg milli gravitational acceleration. n = 63,236. ENMO and MAD values were winsorized at age- and sex-specific 99.9th percentile. Shading bounds represent two standard errors.

Activity intensity categories

Participants predominantly spent their time in the lowest activity intensity category, ranging from 0 to 10 mg of ENMO or MAD (Fig. 5A,B). Time spent in any given acceleration category decreased with increasing activity intensity.

Figure 5

Time spent in acceleration ranges based on (A) ENMO and (B) MAD by sex. ENMO Euclidean Norm Minus One with negative values set to zero, MAD mean amplitude deviation, mg milli gravitational acceleration. n = 63,236. Values were not winsorized. Interpretation of box and whiskers plot: The box depicts the interquartile range (IQR, central 50% of the distribution) with the 25% quantile and the 75% quantile as lower and upper limits, respectively, as well as the median (50% quantile, middle line); the lower whisker shows the smallest observation that is greater than or equal to the 25% quantile − 1.5 × IQR; the upper whisker depicts the largest observation that is less than or equal to the 75% quantile + 1.5 × IQR; the dots indicate outliers beyond the whiskers.

ENMO-based analyses showed that women below age 60 years spent more time in the lowest intensity category (0–10 mg) and less time in the 10–20 mg intensity category than men. Conversely, women over age 60 years spent less time in the 0–10 mg category and more time in the 10–30 mg categories than men (Fig. 6A). MAD-based analyses showed a similar pattern, with the differences being more pronounced among participants over age 60 years and less so in those under age 60 (Fig. 6B).

Figure 6

Sex-differences of time spent in acceleration ranges based on (A) ENMO and (B) MAD by age. ENMO Euclidean Norm Minus One with negative values set to zero, MAD mean amplitude deviation, mg milli gravitational acceleration. n = 63,236. Lines represent the difference of the mean time (minutes per day) spent in acceleration categories in the group of men and the group of women. Shading bounds represent the 95% confidence interval of the two-sample t-test with the Null hypothesis: “true difference in means between group men and group women is equal to zero”. Values were not winsorized.

Time spent above various activity intensity thresholds, calculated using distinct algorithms (Box S1), is plotted in Fig. S6. Using a 1-min epoch without bout detection and applying age-specific cut-points from the literature, participants under 60 years averaged 45.5 min per day (SD = 27.2) in MVPA based on the 70 mg ENMO threshold39, and 72.7 min per day (SD = 34.8) based on the 90 mg MAD threshold40.
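A threshold-based estimate with 1-min epochs and no bout detection can be sketched as follows. The sketch averages twelve 5-s epoch values into each 1-min value and counts minutes at or above the threshold; treating the threshold as inclusive and averaging 5-s values (rather than recomputing the metric on the 1-min window, which matters for MAD) are simplifying assumptions.

```python
def mvpa_minutes(epoch_values_mg, threshold_mg, epoch_s=5):
    """Count 1-min epochs whose mean acceleration reaches the threshold.

    epoch_values_mg: ENMO or MAD values per short epoch (default 5 s)
    threshold_mg: e.g. 70 for ENMO or 90 for MAD, as cited in the text
    No bout criterion is applied; incomplete trailing minutes are ignored.
    """
    per_minute = 60 // epoch_s
    minute_means = [
        sum(epoch_values_mg[i:i + per_minute]) / per_minute
        for i in range(0, len(epoch_values_mg) - per_minute + 1, per_minute)
    ]
    return sum(1 for value in minute_means if value >= threshold_mg)
```

Re-running the function with thresholds slightly above and below the literature value gives a quick sensitivity check of how strongly the MVPA estimate depends on the chosen cut-point.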

Discussion

The NAKO collected raw, seven-day, hip-worn accelerometry data, providing plausible estimates of physical activity from over 63,000 highly adherent participants. Our derived accelerometry summary metrics showed higher physical activity in men than women, declining activity with increasing age, and temporal variation reflecting rest-activity rhythms as well as higher physical activity on working days (Monday to Saturday) versus Sundays. The derived variables are available in four levels of detail, each catering to different research needs: first, individual-level data aggregated across all valid days for overall physical activity analyses; second, week segment-level data for comparing physical activity between weekdays and weekend days; third, day-level data, aggregated by valid day, for comparing days of the week; and fourth, 15-min-level data, aggregated across all valid days, for detailed temporal analyses of physical activity.

Comparing our data to the literature poses challenges because accelerometry study protocols vary due to different practical requirements. For example, in some studies accelerometers were only worn during waking hours. Other large epidemiologic studies such as NHANES (2011–2014), the UK Biobank, the Whitehall II Study, or the Pelotas Birth Cohorts5,6,8,34 used wrist-worn accelerometers, as this promises superior wear compliance41,42 and is more broadly accepted in sleep research43. Since the wrist is exposed to stronger accelerations than the hip, wrist-worn accelerometers yield higher activity values than accelerometers worn at the hip35. For example, in the UK Biobank, age- and sex-specific ENMO values measured at the wrist ranged between 22.9 and 31.2 mg5, while in our study, ENMO values measured at the hip ranged between 9.5 and 12.9 mg. Recent studies utilizing ActiGraph GT3X+ devices worn at the hip align with our results. In a secondary analysis of data from 220 participants of the Iowa Bone Development Study (IBDS), the mean ENMO value was 15.5 mg (SD = 3.9)44. A Spanish study involving 209 men and women found an average ENMO of 11.5 mg (SD = 3.2)45. Another Spanish study with 42 participants reported an average ENMO of 16.0 mg (SD = 5.6) and an average MAD of 24.4 mg (SD = 6.9)35.

We excluded 7933 participants (11%) who had fewer than one valid weekend day, fewer than two valid weekdays, or an incomplete 24-h cycle to capture both weekend and weekday behavior. This exclusion rate is of a similar order to that of other large-scale studies, such as the UK Biobank, which excluded 7% of data due to insufficient wear time5.

Auto-calibration was originally designed for wrist-worn accelerometry data and is less suitable for hip-worn accelerometry due to the reduced sensor orientation variability at the hip. Therefore, caution is advised when analyzing data from individuals with high acceleration levels or extended periods of non-wear time. Despite our stringent exclusion criteria, our dataset still included outliers with extreme values for time spent in MVPA or average acceleration, possibly due to calibration issues or sensor malfunctions. To address this, we recommend winsorizing such outliers at the 99.9th percentile.

We used raw accelerometry data to derive ENMO and MAD, two body acceleration summary metrics that rest on certain assumptions. ENMO assumes well-calibrated sensor data with continuous representation of gravitational acceleration as 1 g. MAD assumes that the epoch mean of the signal vector reflects the gravitational acceleration and that its oscillations are always below 0.2 Hz (1 divided by MAD window length of 5 s). Nonetheless, both metrics deliver information on movement kinematics. ENMO is correlated with energy expenditure and has associations with demographic variables and health outcomes in studies using wrist-worn accelerometers like the UK Biobank and Pelotas Birth Cohorts5,8,25. In our dataset, ENMO and MAD measures were highly correlated and showed similar patterns across age and sex groups. However, MAD may represent a superior metric for hip-worn accelerometer data due to its lower sensitivity to calibration errors24,35. Of note, MAD values are known to be higher than ENMO values; thus, time spent in a given acceleration range is not directly comparable between the two metrics. Also, ENMO and MAD differ in their error structures and in their sensitivity to true body movement. For example, ENMO may show larger error for sedentary behavior when the accelerometer is not well calibrated. Similarly, MAD may overestimate acceleration when the accelerometer rotates with frequencies that have a time period shorter than the epoch length. We focused on ENMO and MAD because these two metrics are the most extensively researched, are easy to document, are computationally fast, have values expressed in units of gravity, and are sufficient for descriptive quality assessment, which is the focus of our current investigation. We acknowledge that other metrics may prove equally valuable; because the number of possible metrics is large, this should be explored in future studies using NAKO data.
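The different sensitivity of the two metrics to calibration error can be illustrated with a small numerical example. It models miscalibration as a uniform gain error on a motionless device, which is a deliberately simplified assumption; the function names are hypothetical.

```python
import math


def enmo_and_mad_mg(samples):
    """Return (ENMO, MAD) in mg for one epoch of (x, y, z) samples in g."""
    r = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    mean_r = sum(r) / len(r)
    enmo = 1000 * sum(max(v - 1.0, 0.0) for v in r) / len(r)
    mad = 1000 * sum(abs(v - mean_r) for v in r) / len(r)
    return enmo, mad


def apply_gain_error(samples, gain):
    """Simulate miscalibration as a uniform gain on all three axes."""
    return [(x * gain, y * gain, z * gain) for x, y, z in samples]


# One 5-s epoch at 100 Hz of a device lying perfectly still
at_rest = [(0.0, 0.0, 1.0)] * 500
enmo, mad = enmo_and_mad_mg(apply_gain_error(at_rest, 1.02))
```

With a 2% gain error, the motionless device yields an ENMO of about 20 mg, although no movement occurred, while MAD stays at about 0 mg because it removes the epoch mean. This is in line with the larger ENMO error during sedentary behavior described above.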

MVPA lacks a clear, measurable definition, and acceleration does not necessarily correlate with MET-based activity intensities. Also, cut points used to classify time spent in sedentary behavior, light physical activity, and MVPA have typically been derived using small study samples, making them less transferable to all populations or age groups. Additionally, the concept of MVPA varies depending on whether cut points are based on wrist or hip data46. Furthermore, categorizing continuous data leads to information loss, reduced precision, and potential bias47. Therefore, the focus in public health should be on encouraging overall physical activity rather than maximizing time spent above certain thresholds48. However, the concept of measuring time spent in MVPA is well established and widely used in accelerometry data analysis. To offer flexibility for future studies, we derived a range of MVPA estimates using literature-based thresholds39,40,49,50,51. For example, researchers could consider assessing time spent in MVPA using 1-min epochs without bout detection, employing literature-based thresholds and surrounding values as a sensitivity analysis. However, in the age group over 70 years, some ENMO thresholds proposed in the literature are concerningly close to zero51,52, which complicates a reliable distinction between variations caused by calibration error and variations caused by time in MVPA. In this case, we recommend either using higher thresholds or focusing on the full distribution of MAD metric values.

Our study has several important strengths. We produced an elaborate set of physical activity metrics using open-source software with transparent documentation of all coding steps (Table S2), which facilitates data interpretation and eases comparability with other studies. Our summary variables have been integrated into the NAKO database and are available for researchers to apply for analytical use.

However, our study has certain limitations. Wearing an accelerometer might induce reactivity effects, leading to increased physical activity, despite guidance for participants to continue their normal routines. Also, the cohort may not fully represent the general population in Germany, limiting the generalizability of our results18. While there are no clear recommendations for deriving MVPA measures, we have made all the necessary data available for comprehensive sensitivity analysis.

Conclusion

The NAKO generated plausible estimates of physical activity from hip-worn accelerometry involving over 63,000 highly compliant participants. We derived a comprehensive set of summary metrics for accelerometry, enhancing the reproducibility, utilization, and interpretation of physical activity data. These variables and the raw data are valuable for future analyses exploring associations between physical activity and disease outcomes. They can be used to statistically adjust for physical activity in multivariate models, support methodological research, and potentially identify high-risk, physically inactive population subgroups. This aligns with efforts to inform intervention strategies and guide policies targeting the WHO’s Global Action Plan goal of reducing physical inactivity by 15% by 203053.