Background & Summary

Walking is the primary means of moving about in the environment, yet the biomechanics have been described only for relatively standard conditions. In daily living, humans may adjust gait variables according to conditions, for example step length to cross a set of paving stones, or step width to traverse a narrow path. Furthermore, humans use different step lengths and step widths when navigating uneven terrain1, which may have implications for energy expenditure2,3,4,5. Experimental data of healthy walking across a broad range of conditions could help to describe how these adjustments occur, and may inform musculoskeletal- and other biomechanical models of locomotion.

We here present a biomechanics dataset of healthy human walking, covering a broad range of walking conditions. The dataset contains biomechanical variables for 33 combinations of speed (0.7–2.0 m·s−1), step length (0.5–1.1 m), and step width (0–0.4 m), varied in five types of conditions. It contains ground reaction forces and motions from healthy young adults (N = 10, age = 23.5 ± 2.5), collected using split-belt instrumented treadmill and motion capture systems respectively (for 60 s per trial). Most trials also contain 3D joint positions, forces, angles, torques and powers, obtained from inverse dynamics analysis on the lower limbs using standard software.

Portions of this data have been used in previously published studies. These include studies of redirection of the body center of mass6 and of work performed by soft tissues during both preferred7 and varied step length walking conditions8.



Ten healthy young adult participants (age 23.5 ± 2.5 years, body mass 73.5 ± 15 kg, height 1.76 ± 0.11 m, mean ± standard deviation) were enrolled in the experiments. Table 1 provides anthropometric information on the included participants. All participants provided their informed consent to participate in the experiment, which complied with all relevant ethical regulations. Both informed consent and study protocol were approved by the Institutional Review Board of the University of Michigan, where the experiment was performed.

Table 1 Anthropometric information of included participants.

Experimental protocol

Participants walked on an instrumented treadmill at 33 different combinations of average walking speed \(\bar{v}\), step length s, step frequency f, and step width w. Ground reaction forces and motions were recorded for 60 s of steady-state gait for each combination. There were five sets of constraints (see Table 2), where some gait parameters were experimentally varied (“Variable”) and some were fixed (“Fixed”) between conditions: Preferred walking at various average walking speeds \(\bar{v}\), Variable step length walking at several step lengths s but fixed step frequency f, Variable speed walking at several step frequencies f but fixed step length s, Fixed speed walking with inversely varying combinations of step length s and step frequency f, or Variable step width walking at several step widths w but fixed speed \(\bar{v}\) and fixed step frequency f. Step length s and step frequency f were varied relative to the individual preferred values s*and f*, determined from unconstrained walking at a nominal speed (v* = 1.25 m·s−1). The nominal walking speed was chosen based on previous research9, suggesting that 1.25 m·s−1 approximately corresponds to the preferred walking speed for adults. Walking speed \(\bar{v}\) and step frequency f were manipulated by setting the treadmill belt speed and asking participants to walk on the beat of an audio cue, respectively. Step length s was manipulated through both walking speed and step frequency from their ratio \(s=\bar{v}/f\). Step width w was self-selected in all conditions except Variable step width walking, where it was experimentally manipulated by asking participants to step on laser lines projected onto the treadmill surface at the specified widths. Participants generally followed experimental directions reasonably well but imperfectly, and so actual step lengths, frequencies, and widths should be obtained from the data.

Table 2 Trial lookup.

Experimental procedures

Participants were familiarized with a subset of trials during a 6-minute practice session before participating in the actual experiment. After performing a static standing trial for reference, the participant was asked to walk in a manner they preferred while the treadmill belt speed was set to 1.25 m·s−1. This first trial was used to determine the participant’s preferred step length s* and step frequency f* (see Table 1). Next, the participant practiced with a subset of trials, each lasting 30 s and together spanning the full experimental range (see Table 2). The order of practice trials was randomized for each participant individually. Next, the participant performed a series of 33 experimental testing trials, each for 60 s. Like with the practice trials, the participant was first asked to walk in a manner they preferred while the treadmill belt speed was 1.25 m·s−1. This first trial was used to reassess the participant’s preferred step length s* and step frequency f* (see Table 1). The preferred walking trial was repeated twice: once in the middle of the testing session, and once at the end. The order of the other 30 testing trials was randomized for each participant individually.

Instrumentation and data collection

Kinematic and kinetic data were collected with standard gait laboratory procedures. Ground reaction forces were measured at 1200 Hz with two force platforms (Bertec, Columbus, OH, USA) located underneath a custom split-belt treadmill. Motion capture data was collected at 120 Hz using a standard 3D system (Motion Analysis Corporation, Santa Rosa, CA, USA), synchronously with the force platform recordings. Single markers were located at the head of the 5th metatarsus, calcaneus, malleoli, knee epicondyles, greater trochanter, anterior superior iliac spin, sacrum, acromion, elbow epicondyle, and wrist. Cluster markers were attached to the shanks and thighs. Virtual markers at Helen Hayes (Davis) points10 were estimated from the pelvis markers (see Fig. 1 and Table 3). Raw forces and motions were stored in C3D files.

Fig. 1: Marker locations on a human participant and in the inverse dynamic model.
figure 1

(A) Participant with motion capture markers attached to the limbs and anatomical locations. (B) Inverse dynamics model with estimated marker locations and modelled segment positions, visualized with a skeleton stick figure. Note that trunk and arm segments are shown here for illustrative purposes but were not used in inverse dynamics analysis due to the incomplete upper body marker set.

Table 3 Marker locations.

Data analysis and export in Visual3D

Following storage in C3D format, collected forces and motions were filtered and processed with standard inverse dynamics software (Visual3D, C-Motion, Germantown, MD, USA). Visual3D software was first used to filter ground reaction forces and marker locations, employing a Butterworth low-pass filter with cut-off frequencies of 25 Hz and 6 Hz respectively, as in previous research7. Ankle, knee, and hip joints were defined based on locations of malleoli, epicondyles, and Helen Hayes (Davis) points respectively10. Joint angles were determined relative to a static standing trial. The offset in the ground reaction force was determined as the average over a manually selected time interval during which both feet were off the ground, and was subtracted from the ground reaction force signal. Inverse dynamics analysis was then performed on the processed ground reaction forces and marker locations of the lower limbs to obtain biomechanical variables, including 3D joint positions, forces, angles, torques, and powers. Filtered forces, motions and biomechanical variables were stored in CMO files. Finally, CMO data was exported to MAT files for further analysis with MATLAB software (MathWorks Inc., Natick, MA, USA), including selecting five strides of good quality (see Fig. 2).

Fig. 2
figure 2

Workflow diagram of data processing. Raw data was processed with Visual 3D and MATLAB software to yield 5 strides data files. The required files for each step are shown in italics under the arrows. The folders and fields that contain the data in each step of the process are shown under the workflow.

Data Records

The full dataset is available on figshare11. The data repository contains three folders, corresponding to three levels of data: (1) Raw data, including forces and motions in C3D format, (2) Inverse dynamics data, including computed biomechanical variables in CMO format, (3) Exported data, including biomechanical variables in MAT format (see Fig. 2). A subset of data level 3 (i.e., 5 strides data, see text below for description) has previously been made available as part of another publication8 and is also available on figshare12.

Level 1: Raw data

Raw forces and marker locations are stored in public domain C3D format, which can be opened and processed using inverse dynamics software such as Visual3D. The C3D format is supported by all major 3D Motion Capture System manufacturers. Forces and marker locations may be used by themselves or combined in inverse dynamics analysis to compute biomechanical variables like joint torque and power. Each participant has one C3D file for each trial. For example, the C3D file for participant 1 trial 1 is called p1_trial1.c3d. All participants have 33 collected trials and therefore 33 C3D files, with one exception due to a missing C3D file (see Table 1). Table 2 indicates the experimental condition and gait parameters for each trial. For example, it shows that trial 1 belongs to the variable speed condition, with average speed \(\bar{v}\) = 0.70 m·s−1 and step length s = s*.

Level 2: Inverse dynamics data

Raw data were imported and processed using Visual3D software to yield inverse dynamics data, stored in (proprietary) CMO format. The CMO file is a combination of several C3D movement files, a C3D static calibration file, and a MDH biomechanical model description file. The model description file may be optionally changed by the user. Having multiple C3D movement files within one CMO file allows simultaneously applying a single biomechanical model to multiple trials. Each participant has one CMO file, containing all 33 trials. For example, the CMO file of participant 1 is called p1.cmo. Apart from data, the CMO file contains parameter values that can be set by the user. Data, parameter values, and model description together yield inverse dynamics variables, which can be expressed in local or global coordinate systems.

Level 3: Exported data

Inverse dynamics variables were exported to MAT format, which can be opened using MATLAB software. We exported the variables through executing V3S pipeline scripts in Visual3D. A V3S pipeline script is an ASCII file that can be edited with any common word processing program or within the Visual 3D software. It instructs the Visual3D software to perform additional computation (e.g., filtering, coordinate transformation), and specifies which variables to export. Our V3S pipeline scripts call the Export_Data_To_Matfile Visual3D function to export data to MAT format. As an example, we have included a V3S pipeline script (p1export.v3s) in the software repository (see Code availability). After exporting each trial for each participant, we combined the trials into one file per participant. For example, the exported variables of participant 1 are stored in p1_AllStridesData.mat. Each file contains a 1 × 33 struct-array variable called “data” with fields: Platform, Analog, Target Data, Landmark, Time, Force, Kinetic_Kinematic, Link_Model_Based, and Participant. Each of these fields corresponds to one or more Visual3D folder(s) (see Fig. 2). In addition, we identified five strides of good quality and stored this in a separate data file. For example, the 5 strides data of participant 1 are stored in p1_5StridesData.mat. In most cases, there were many good strides within each trial, but some trials required selection to eliminate motion capture occlusions and to ensure that left-right steps landed on separate force plates.

Technical Validation

We visualized, analysed, selected and compared data in MATLAB, and provide example scripts employed for these purposes (see Code availability). Overall, data was consistent within participants, across participants, and in comparison to existing datasets. On a few occasions, data appeared missing, inconsistent, or implausible, and we decided not to select five strides of good quality. Running the provided main.m script (see Code availability) results in recreating the 5 strides data files from the exported data files (see Fig. 2).

Selecting five strides of good quality

Five strides may be selected with the MATLAB script ‘select_5strides.m’, which allows the user to select a five strides interval and visualize the five strides in comparison to the entire trial. The script outputs a MAT file called 5strides_heelstrikes.mat, containing the sample numbers that define the 5-strides interval. This MAT file is then used in another script called process_5strides.m to create the 5 strides data file.

Analysis and visualization

The MATLAB script ‘example_plotting.m’ may be used to investigate the 5 strides data for a specified biomechanical variable, participant and trial. It also creates a 3 × 3 plot of angle, moment and power for all three joints (i.e., ankle, knee, hip) (see Fig. 3). The MATLAB script ‘analyse_biomechanics_script.m’ calculates the mechanical work per stride at each joint and for the whole body. The user may clone the software repository and modify these scripts to accommodate their specific needs.

Fig. 3
figure 3

Typical example data of joint angle, moment, and power for ankle, knee, and hip. Lines indicate data from left leg (red) and right leg (yellow) of a representative participant (1), walking at nominal speed (1.25 m·s−1) with preferred step frequency and step length. For angle and moment, positive (negative) values indicate extension (flexion).

Trials without five strides of good quality

There were a few occasions where we chose not to select five strides of good quality, including the variable step width conditions for all subjects, all trials for subject 10 and three individual cases.

Variable step width walking for all subjects

We did not select 5 strides data for variable step width walking (five trials for each subject). This condition included trials with zero or near-zero step width, where subjects did not generally step on distinct treadmill belts with each foot. Consequently, one of the two force plates measures the net effect of both limbs during double support, making it impossible to know the force acting on each limb. Individual limb inverse dynamics is therefore not appropriate for these trials and the computed biomechanical variables are unreliable. We advise to only use these trials when applying additional methods to estimate the force at each limb.

All trials for subject 10

We also did not select 5 strides data for subject 10, as comparison of biomechanical variables between subjects (within trials) consistently revealed this subject to be an outlier. Further inspection indicated that the modelled pelvis was disproportionally small, potentially due to incorrect marker placement. This may have resulted in an inaccurate hip joint centre location, resulting in physiologically implausible hip torques and powers. We advise to determine the hip joint centre location using methods that are independent of pelvis marker locations, for example using helical axis methods based on thigh marker locations.

Three individual cases

We did not select 5 strides data for three additional trials (out of the remaining 249). These were trials that showed large differences with the subject-mean, and with similar trials within the same subject. We discovered issues with these trials upon further inspection, and decided to exclude them from the 5 strides dataset. These trials belonged to one of two issue-categories (1) Synchronization error and (2) Incorrect stepping on force plates. These categories are discussed in more detail below.

  1. 1.

    Synchronization error (two trials, different subjects)

    Two trials (from different subjects) showed physiologically implausible joint powers that differed considerably from similar trials within the same subject, as well as from identical trials in other subjects. Upon closer inspection, these trials (subject 3, trial 4 and subject 9, trial 14) seemed to have a delay between recorded motions and forces. This was apparent from the discrepancy between the centre of pressure time-series (from recorded forces) and ankle- and toe marker location time-series (from recorded motions), which are closely related in other trials. We therefore decided not to select five strides for these trials. We advise to only use these trials after correctly synchronizing both signals, which should be done at the level of the individual C3D files.

  2. 2.

    Incorrect stepping on force plates (one trial)

    Like in the variable step width trials, one subject also consistently stepped on one of the two force platforms in one of their other trials (subject 4 trial 1). We therefore decided not to select five steps for this trial. Like with the variable step width trials, we advise to only use this trial when applying additional methods to estimate the force at each limb.

Missing data

There were three trials (out of 330) with fully missing data and one subject with partially missing data, due to saving errors during data collection. In one of these cases (subject 7, trial 24), the C3D file was missing altogether. In another case (subject 6, trial 31), there was a C3D file, but it did not include marker locations (only forces). In the third case (subject 6, trial 21), data recording was stopped prematurely, and less than five strides were recorded. Lastly, the trials from subject 9 do not include the upper body marker positions of the acromion, elbow epicondyle, and wrist.

Comparison to another dataset

Most trials showed relatively good consistency between participants, which may be verified by running the example_plotting.m MATLAB script for various biomechanical variables. For example, the participant-mean ankle angle and ankle moment were a reasonable representation of the individual participant data for preferred walking at 1.1 m·s−1 (see left column Fig. 4). In addition, these data show qualitative agreement with previously reported data for similar conditions. For example, the participant-mean ankle angle and ankle moment for preferred walking at 1.1 m·s−1 was comparable to those for preferred walking at 1.0 m·s−1 as reported in another (open-access) data descriptor paper13 (see right column Fig. 4). The consistency across both participants and datasets supports the validity of the current dataset.

Fig. 4
figure 4

Representative data from present dataset compared to another dataset. Left column: Five strides of ankle data, during which participants walked at 1.1 m·s−1 with their preferred step length (preferred walking condition). Different colours represent different participants; the mean across participants is indicated by a black line (largely covered by the other lines). The ankle data is fairly consistent across participants. Right column: Current data (red dashed line) vs. data from another published repository13 (blue solid line). Present data agrees qualitatively with the data from a walking trial of comparable speed (1.0 m·s−1) obtained from the other dataset (mean indicated by solid line, s.d. indicated by shaded area).


The repeated-measures design of the experiment allows investigating effects of walking speed, step frequency, step length and step width on biomechanical variables such as mechanical work, force and joint moments. For example, previous studies have used this dataset to investigate how biomechanical variables like center of mass velocity6 and soft tissue work7,8 vary with gait parameters. While sufficient for investigating these particular effects, the sample size (N = 10) here may not be appropriate for some other applications. Further, it must be noted that because this dataset uses controlled, treadmill walking, its application should be limited to research questions regarding these applications. Considering biomechanical differences between treadmill and overground walking14 as well as between steady and non-steady walking15, the dataset does not address changes in self-selected, spontaneous walking speeds during overground walking16.