Background & Summary

Natural human movements are complex and adaptable, involving highly coordinated sensorimotor processing in multiple cortical and subcortical areas1,2,3,4. However, experiments on the neural basis of human upper-limb movements often study constrained, repetitive motions such as center-out reaching within a controlled laboratory setup5,6,7,8,9. Such studies have greatly increased our knowledge about the neural correlates of movement, but it remains unclear how well these findings generalize to the natural movements we make in everyday situations10,11. Some human upper-limb movement studies have incorporated self-cued and less restrictive movements12,13,14,15,16, but focusing on unstructured, naturalistic movements can enhance our knowledge of the neural basis of motor behaviors17, help us understand the role of neurobehavioral variability18,19, and aid in the development of robust brain-computer interfaces for real-world use20,21,22,23,24,25,26.

Here, we present synchronized intracranial neural recordings and upper body pose trajectories opportunistically obtained from 12 human participants while they performed unconstrained, naturalistic movements over 3–5 recording days each (55 days total). Intracranial neural activity, recorded via electrocorticography (ECoG), involves placing electrodes directly on the cortical surface, beneath the skull and dura, to provide high spatial and temporal resolution27,28,29. Pose trajectories were obtained from concurrent video recordings using computer vision to automate the often-tedious annotation procedure that has previously precluded the creation of similar datasets30,31. Along with these two core datastreams, we have added extensive metadata, including thousands of wrist movement initiation events previously used for neural decoding32,33, 10 quantitative event-related features describing the type of movement performed and any relevant context18, coarse labels describing the participant’s behavioral state based on visual inspection of videos34, and 14 different electrode-level features18. This dataset, which we call AJILE12 (Annotated Joints in Long-term Electrocorticography for 12 human participants), builds on our previous AJILE dataset35 and is depicted in Fig. 1.

Fig. 1

Schematic overview of our Annotated Joints in Long-term Electrocorticography for 12 human participants (AJILE12) dataset. AJILE12 includes ECoG recordings and upper body pose trajectories for 12 participants across 55 total recording days, along with a variety of behavioral, movement event-related, and electrode-level metadata. All data is stored on The DANDI Archive in the NWB data standard, and we have created a custom browser-based dashboard in Jupyter Python to facilitate data exploration without locally downloading the data files.

AJILE12 has high reuse value for future analyses because it is large, comprehensive, well-validated, and shared in the NWB data standard. We have included 55 days of semi-continuous intracranial neural recordings along with thousands of verified wrist movement events, which greatly exceeds the size of both typical ECoG datasets from controlled experiments36 and other long-term naturalistic ECoG datasets34,35,37,38. Such a wealth of data improves statistical power and enables large-scale exploration of more complex behaviors than previously possible, especially with modern machine learning techniques such as deep learning32,39,40,41,42. In addition, AJILE12 contains comprehensive metadata, including coarse behavior labels, quantitative event features, and localized electrode positions in group-level coordinates that enable cross-participant comparisons of neural activity. We have also pre-processed the neural data and visually validated all 6931 wrist movement events to ensure high-quality data; these data have already been used in multiple studies18,32,33. We have released AJILE12 in the NWB data standard (Table 1)43 to adhere to the FAIR data principles of findability, accessibility, interoperability, and reusability44. Unified, open-source data formats such as NWB enable researchers to easily access the data and apply preexisting, reusable workflows instead of starting from scratch. Furthermore, we have developed an accessible and interactive browser-based dashboard that visualizes neural and pose activity, along with relevant metadata. This dashboard can access AJILE12 remotely to visualize the data without requiring local data file downloads, improving AJILE12’s accessibility.

Table 1 The main variables contained in each data file.

Methods

Participants

We collected data from 12 human participants (8 males, 4 females; 29.4 ± 7.6 years old [mean ± SD]) during their clinical epilepsy monitoring at Harborview Medical Center (Seattle, USA). See Table 2 for individual participant details. Each participant had been implanted with electrocorticography (ECoG) electrodes placed based on clinical need. We selected these participants because they were generally active during their monitoring and had ECoG electrodes located near motor cortex. All participants provided written informed consent. Our protocol was approved by the University of Washington Institutional Review Board.

Table 2 Individual participant characteristics.

Data collection

Semi-continuous ECoG and video were passively recorded from participants during 24-hour clinical monitoring for epileptic seizures. Recordings lasted 7.4 ± 2.2 days (mean ± SD) for each participant with sporadic breaks in monitoring (on average, 8.3 ± 2.2 breaks per participant each lasting 1.9 ± 2.4 hours). For all participants, we only included recordings during days 3–7 following the electrode implantation surgery to avoid potentially anomalous neural and behavioral activity immediately after the surgery. We excluded recording days with corrupted or missing data files, as noted in Table 2, and stripped all recording dates to de-identify participant data. These long-term, clinical recordings include various everyday activities, such as eating, sleeping, watching television, and talking while confined to a hospital bed. ECoG and video sampling rates were 1 kHz and 30 FPS (frames per second), respectively.

ECoG data processing

We used custom MNE-Python scripts to process the raw ECoG data45. First, we removed DC drift by subtracting out the median voltage at each electrode. We then identified high-amplitude data discontinuities, based on abnormally high electrode-averaged absolute voltage (>50 interquartile ranges [IQRs]), and set all data within 2 seconds of each discontinuity to 0.
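The drift removal and discontinuity masking described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the MNE-Python scripts used for the dataset: the function name, its parameters, and the reading of ">50 IQRs" as "more than 50 IQRs above the median of the electrode-averaged absolute voltage" are assumptions.

```python
import numpy as np

def remove_drift_and_discontinuities(ecog, sfreq=1000.0, n_iqr=50.0, pad_s=2.0):
    """Hypothetical helper. ecog has shape (n_electrodes, n_samples).

    Returns the cleaned data and a boolean mask of the samples set to 0.
    """
    # Remove DC drift by subtracting each electrode's median voltage.
    centered = ecog - np.median(ecog, axis=1, keepdims=True)
    # Electrode-averaged absolute voltage at every sample.
    avg_abs = np.mean(np.abs(centered), axis=0)
    q25, q50, q75 = np.percentile(avg_abs, [25, 50, 75])
    # Assumption: "abnormally high" = more than n_iqr IQRs above the median.
    bad = avg_abs > q50 + n_iqr * (q75 - q25)
    # Grow each flagged sample by pad_s seconds on both sides.
    pad = int(round(pad_s * sfreq))
    kernel = np.ones(2 * pad + 1)
    mask = np.convolve(bad.astype(float), kernel, mode="same") > 0
    cleaned = centered.copy()
    cleaned[:, mask] = 0.0
    return cleaned, mask
```

The returned mask makes it easy to exclude the zeroed samples from later statistics instead of treating them as real signal.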

With data discontinuities removed, we then band-pass filtered the data (1–200 Hz), notch filtered to minimize line noise at 60 Hz and its harmonics, downsampled to 500 Hz, and re-referenced to the common median for each grid, strip, or depth electrode group. For each recording day, noisy electrodes were identified based on abnormal standard deviation (>5 IQRs) or kurtosis (>10 IQRs) compared to the median value across electrodes. Using this procedure, we marked on average 7.3 ± 5.6 ECoG electrodes as bad during each participant’s first available day of recording (Table 2).
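As a rough illustration of the re-referencing and noisy-electrode steps, the sketch below uses NumPy only. It is not the authors' code: all function names are hypothetical, kurtosis is computed as excess kurtosis, and "abnormal" is assumed to mean a deviation in either direction from the across-electrode median. Filtering and downsampling are omitted.

```python
import numpy as np

def excess_kurtosis(x, axis=-1):
    """Population excess kurtosis (0 for a Gaussian)."""
    m = x - x.mean(axis=axis, keepdims=True)
    var = np.mean(m ** 2, axis=axis)
    return np.mean(m ** 4, axis=axis) / var ** 2 - 3.0

def flag_outliers(stat, n_iqr):
    """True where a per-electrode statistic is > n_iqr IQRs from the median."""
    q25, q50, q75 = np.percentile(stat, [25, 50, 75])
    return np.abs(stat - q50) > n_iqr * (q75 - q25)

def find_good_electrodes(ecog, sd_iqr=5.0, kurt_iqr=10.0):
    """ecog has shape (n_electrodes, n_samples); returns a boolean 'good' vector."""
    noisy_sd = flag_outliers(ecog.std(axis=1), sd_iqr)
    noisy_kurt = flag_outliers(excess_kurtosis(ecog, axis=1), kurt_iqr)
    return ~(noisy_sd | noisy_kurt)

def common_median_reference(ecog, groups):
    """Subtract the per-sample median within each electrode group."""
    groups = np.asarray(groups)
    out = np.empty_like(ecog, dtype=float)
    for g in np.unique(groups):
        idx = groups == g
        out[idx] = ecog[idx] - np.median(ecog[idx], axis=0, keepdims=True)
    return out
```

Here `groups` would hold one label per electrode (for example the grid, strip, or depth group name), so that each group is referenced independently.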

Electrode positions were localized using the Fieldtrip toolbox in MATLAB. This process involved co-registering preoperative MRI and postoperative CT scans, manually selecting electrodes in 3D space, and warping electrode positions into MNI space (see Stolk et al.46 for further details).

Markerless pose estimation

We performed markerless pose estimation on the raw video footage using separate DeepLabCut models for each participant31. First, one researcher manually annotated the 2D positions of 9 upper-body keypoints (nose, ears, wrists, elbows, and shoulders) in 1000 video frames for each participant (https://tinyurl.com/human-annotation-tool). Frames were randomly selected across all recording days, with preference towards frames during active, daytime periods. These 1000 frames correspond to 0.006% of the total frames from each participant’s video recordings. These manually annotated frames were used to train a separate DeepLabCut neural network model for each participant (950 frames for training, 50 frames for validation). The model architecture was a convolutional neural network that was 50 layers deep (ResNet-50). We then applied the trained model to every video frame for that participant to generate estimated pose trajectories.

We synchronized ECoG data and pose trajectories using video timestamps and combined multiple recording sessions so that each file contained data from one entire 24-hour recording day that started and ended at midnight47.

Wrist movement event identification

We used the estimated pose trajectories in order to identify unstructured movement initiation events of the wrist contralateral to the implanted hemisphere. To identify movement events, a first-order autoregressive hidden semi-Markov model was applied to the pose trajectory of the contralateral wrist. This model segmented the contralateral wrist trajectory into discrete move or rest states. Movement initiation events were identified as state transitions where 0.5 seconds of rest was followed by 0.5 seconds of wrist movement (see Singh et al.33 for further details).
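Once a per-frame move/rest segmentation is available, the onset rule above reduces to a scan over state transitions. The sketch below assumes the segmentation is already given as a sequence of 0 (rest) and 1 (move) values at the 30 FPS video rate; it does not implement the autoregressive hidden semi-Markov model itself, and the function name is hypothetical.

```python
def find_movement_onsets(states, fps=30, min_rest_s=0.5, min_move_s=0.5):
    """Return frame indices where >= min_rest_s of rest is followed by
    >= min_move_s of movement. states holds 0 (rest) or 1 (move) per frame."""
    n_rest = int(round(min_rest_s * fps))
    n_move = int(round(min_move_s * fps))
    onsets = []
    # Only consider frames with enough history before and enough data after.
    for i in range(n_rest, len(states) - n_move + 1):
        if states[i - 1] == 0 and states[i] == 1:
            rest_ok = all(s == 0 for s in states[i - n_rest:i])
            move_ok = all(s == 1 for s in states[i:i + n_move])
            if rest_ok and move_ok:
                onsets.append(i)
    return onsets
```

At 30 FPS, 0.5 seconds corresponds to 15 frames, so a transition preceded by only a brief pause or followed by only a brief movement is skipped.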

Next, we selected the movement initiation events that most likely corresponded to actual reaching movements. We excluded arm movements during sleep, unrelated experiments, and private times based on coarse behavioral labels, which are described in the next section. In addition, we only retained movement events that (1) lasted 0.5–4 seconds, (2) had DeepLabCut confidence scores >0.4, indicating minimal marker occlusion, and (3) had parabolic wrist trajectories, as determined by a quadratic fit to the wrist’s radial movement (\({R}^{2} > 0.6\)). We used this quadratic fit criterion to eliminate outliers with complex movement trajectories. For each recording day, we selected up to 200 movement events with the highest wrist speeds during movement onset. Finally, we visually inspected all selected movement events and removed those with occlusions or false positive movements (17.8% ± 9.9% of events [mean ± SD]).
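The three retention criteria can be illustrated with a short NumPy sketch. The helper names are hypothetical, and two details are assumptions rather than facts from the text: the quadratic is fit against frame index, and the confidence criterion is applied to the mean confidence over the event.

```python
import numpy as np

def quadratic_fit_r2(radial):
    """R^2 of a second-order polynomial fit to radial wrist displacement."""
    radial = np.asarray(radial, dtype=float)
    t = np.arange(radial.size)
    fitted = np.polyval(np.polyfit(t, radial, deg=2), t)
    ss_res = np.sum((radial - fitted) ** 2)
    ss_tot = np.sum((radial - radial.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def keep_event(radial, confidences, fps=30.0):
    """Apply the duration, confidence, and parabolic-trajectory criteria."""
    duration_s = len(radial) / fps
    return bool(
        0.5 <= duration_s <= 4.0
        and np.mean(confidences) > 0.4   # assumption: mean over the event
        and quadratic_fit_r2(radial) > 0.6
    )
```

A smooth out-and-back reach is close to a parabola and passes easily, whereas an oscillating trajectory yields a low \(R^{2}\) and is rejected.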

For each movement event, we also extracted multiple quantitative behavioral and environmental features. To quantify movement trajectories, we defined a reach as the maximum radial displacement of the wrist during the identified movement event, as compared to wrist position at movement onset. Movement features include reach magnitude, reach duration, 2D vertical reach angle (90° for upward reaches, −90° for downward reaches), and radial speed during movement onset. We also include the recording day and time of day when each movement event occurred, as well as an estimate of speech presence during each movement using audio recordings.
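To make the reach definition concrete, the following sketch computes reach magnitude, duration, and vertical angle from a wrist trajectory in pixel coordinates. It is a simplified illustration under stated assumptions: image y-coordinates are taken to increase downward, duration is taken as the time from onset to peak displacement, and the angle is folded into the range −90 to 90 degrees by using the absolute horizontal displacement. The function name and dictionary keys are hypothetical.

```python
import math

def reach_features(xs, ys, fps=30.0):
    """xs, ys: wrist pixel coordinates per frame, starting at movement onset."""
    x0, y0 = xs[0], ys[0]
    # Radial displacement from the wrist position at movement onset.
    radial = [math.hypot(x - x0, y - y0) for x, y in zip(xs, ys)]
    peak = max(range(len(radial)), key=radial.__getitem__)
    dx = xs[peak] - x0
    # Assumption: image rows grow downward, so flip the sign to make
    # upward reaches positive.
    dy = -(ys[peak] - y0)
    return {
        "magnitude_px": radial[peak],
        "duration_s": peak / fps,
        "angle_deg": math.degrees(math.atan2(dy, abs(dx))),
    }
```

For a purely vertical upward reach, the horizontal displacement is zero and the angle evaluates to 90 degrees.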

In addition, we quantified the amount of bimanual movement for each event based on ipsilateral wrist movement. These features include a binary classification of bimanual/unimanual based on temporal lag between wrist movement onsets, the ratio of ipsilateral to contralateral reach magnitude, and the amount of each contralateral move state that temporally overlapped with an ipsilateral move state. The binary feature was bimanual if at least 4 frames (0.13 seconds) of continuous ipsilateral wrist movement began either 1 second before contralateral wrist movement initiation or anytime during the contralateral wrist move state. Please see Peterson et al.18 for further methodological details.
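The binary bimanual rule and the overlap feature can be sketched as follows, given per-frame ipsilateral move/rest states. This is one literal reading of the rule, in which an ipsilateral run must begin inside the window (from 1 second before contralateral onset until the end of the contralateral move state); runs that began earlier are not counted. The function names and the half-open `[onset, offset)` frame convention are assumptions.

```python
def is_bimanual(ipsi_states, onset, offset, fps=30, min_frames=4, pre_s=1.0):
    """ipsi_states: 0/1 per frame for the ipsilateral wrist.
    onset, offset: contralateral move state as frame range [onset, offset)."""
    window_start = max(0, onset - int(round(pre_s * fps)))
    for i in range(window_start, min(offset, len(ipsi_states))):
        # A run begins where a move frame follows a rest frame.
        begins = ipsi_states[i] == 1 and (i == 0 or ipsi_states[i - 1] == 0)
        if begins:
            run = ipsi_states[i:i + min_frames]
            if len(run) == min_frames and all(s == 1 for s in run):
                return True
    return False

def overlap_fraction(ipsi_states, onset, offset):
    """Fraction of the contralateral move state with ipsilateral movement."""
    return sum(ipsi_states[onset:offset]) / (offset - onset)
```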

Coarse behavioral labels

To improve wrist movement event identification, we performed coarse annotation of the video recordings every 3 minutes. These behavioral labels were either part of a blocklist to avoid during event detection or general activities/states that the participant was engaged in at the time. Identified activities include sleep/rest, inactive, and active behaviors, which were further subdivided into activities such as talking, watching TV, and using a computer or phone (Fig. 2). Blocklist labels include times where event detection would likely be inaccurate, such as camera movement and occlusion, as well as private times and unrelated research experiments. Some participants also have clinical procedure labels, indicating times when the clinical staff responded to abnormal participant behavior. We upsampled all labels to match the 30 Hz sampling rate of the pose data. Tables 3 and 4 show the duration of each label across participants for activity and blocklist labels, respectively.
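Upsampling a 3-minute annotation to the 30 Hz pose rate amounts to repeating each label 180 × 30 = 5400 times. The sketch below assumes each annotation is simply held constant until the next one; the function names and the label strings used in the usage example are illustrative and may not match the exact strings stored in the data files.

```python
def upsample_labels(labels, interval_s=180, fps=30):
    """Repeat each coarse label so there is one label per video frame."""
    frames_per_label = int(interval_s * fps)
    return [lab for lab in labels for _ in range(frames_per_label)]

def blocklist_mask(frame_labels, blocklist):
    """True for frames that should be skipped during event detection."""
    return [lab in blocklist for lab in frame_labels]

# Usage with made-up label names:
frame_labels = upsample_labels(["sleep/rest", "talk", "camera movement"])
skip = blocklist_mask(frame_labels, {"camera movement", "private time"})
```

A full 24-hour day contains 480 such annotations, which expand to 2,592,000 frame-level labels.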

Fig. 2

Coarse behavior labelling. (a) We annotated participant behavior in the video recordings using hierarchical labels to detail common awake and active behaviors. These annotations also include blocklist labels, which indicate times to potentially avoid during data exploration. (b) We show an example of the behavior labels for participant P01 during the entirety of recording day 4. Sleep/rest occurs in the morning and night times, as expected, with predominantly active periods during the day (8:00–20:00). Bottom row shows detailed active labels during a 4-hour active period that is dominated mostly by talk and TV behaviors. Note that these detailed active labels can overlap in time.

Table 3 Coarse activity label durations (in hours) for each participant.
Table 4 Coarse blocklist label durations (in hours) for each participant.

Data Records

The data files are available on The DANDI Archive (https://doi.org/10.48324/dandi.000055/0.220127.0436)47, in the Neurodata Without Borders: Neurophysiology 2.0 (NWB:N) format43. All datastreams and metadata have been combined into a single file for each participant and day of recording, as indicated by the file name. For example, sub-01_ses-3_behavior+ecephys.nwb contains data from participant P01 on recording day 3. We used PyNWB 1.4.0 to load and interact with these data files. Table 1 shows the location of all main variables within each data file.

Each file contains continuous ECoG and pose data over a 24-hour period, with units of μV and pixels, respectively. ECoG data is located under \acquisition\ElectricalSeries as a pynwb.ecephys.ElectricalSeries variable. Pose data can be found under \processing\behavior\data_interfaces\Position as a pynwb.behavior.Position variable. Pose data is provided for the left/right ear (L_Ear, R_Ear), shoulder (L_Shoulder, R_Shoulder), elbow (L_Elbow, R_Elbow), and wrist (L_Wrist, R_Wrist), as well as the nose (Nose).

In addition to these core datastreams, each file contains relevant metadata. Contralateral wrist movement events are located in \processing\behavior\data_interfaces\ReachEvents as an ndx_events.events.Events variable. Quantitative neural and behavioral features for each event can be found in \intervals\reaches as a pynwb.epoch.TimeIntervals table with columns for each feature. Coarse behavioral labels are included in \intervals\epochs as a pynwb.epoch.TimeIntervals table. Each row contains the label along with the start and stop time in seconds.

We also include electrode-specific metadata in \electrodes as a hdmf.common.table.DynamicTable. Columns contain different metadata features, such as Montreal Neurological Institute (MNI) x, y, z coordinates and electrode group names. Electrode groups were named by clinicians based on their location in the brain. This table also contains the standard deviation, kurtosis, and median absolute deviation for each electrode computed over the entire recording file (excluding non-numeric values). Electrodes that we identified as noisy based on abnormal standard deviation and kurtosis are marked as False under the ‘good’ column. Table 2 shows the number of good electrodes that remain for each participant during the first available day of recording. We have also included the \({R}^{2}\) scores obtained from regressing ECoG spectral power on the 10 quantitative event features for each participant’s wrist movement events18. Low-frequency power (used for low_freq_R2) indicates power between 8–32 Hz, while high-frequency power (used for high_freq_R2) denotes power between 76–100 Hz.
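The locations listed in this section map onto PyNWB attribute access roughly as sketched below. This is a minimal, untested-against-the-archive sketch: the helper names are hypothetical, the file-name pattern is generalized from the single example given above, and reading ReachEvents is assumed to require the ndx_events extension to be installed. The PyNWB import is deferred so that the file-name helper can be used without PyNWB.

```python
def nwb_filename(participant, day):
    """Assumed naming pattern, generalized from sub-01_ses-3_behavior+ecephys.nwb."""
    return f"sub-{participant:02d}_ses-{day}_behavior+ecephys.nwb"

def summarize_file(path):
    """Open one AJILE12 file and pull out the main variables."""
    from pynwb import NWBHDF5IO  # requires pynwb to be installed

    with NWBHDF5IO(path, mode="r") as io:
        nwb = io.read()
        ecog = nwb.acquisition["ElectricalSeries"]
        position = nwb.processing["behavior"].data_interfaces["Position"]
        wrist = position.spatial_series["L_Wrist"]
        reaches = nwb.intervals["reaches"].to_dataframe()
        labels = nwb.epochs.to_dataframe()
        electrodes = nwb.electrodes.to_dataframe()
        good = electrodes[electrodes["good"].astype(bool)]
        return {
            "ecog_shape": ecog.data.shape,    # shape only; no full read
            "wrist_shape": wrist.data.shape,
            "n_reaches": len(reaches),
            "n_labels": len(labels),
            "n_good_electrodes": len(good),
        }

# Usage (the file must already be available locally):
# summary = summarize_file(nwb_filename(1, 3))
```

Only shapes and small tables are read here, which keeps the example light even though each file spans a full 24-hour day.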

Technical Validation

In this section, we assess the technical quality of AJILE12 by validating our two core datastreams: intracranial neural recordings and pose trajectories. In addition to this assessment, we have previously validated the quality and reliability of AJILE12 in multiple published studies18,32,33. We validated ECoG data quality by assessing spectral power projected into common brain regions48. This projection procedure enables multi-participant comparisons despite heterogeneous electrode coverage and reduces the dimensionality of the ECoG data from 64 or more electrodes (Fig. 3(a)) to a few brain regions of interest18,32. For this analysis, we focused on 4 sensorimotor and temporal regions in the left hemisphere defined using the AAL2 brain atlas48,49: precentral gyrus, postcentral gyrus, middle temporal gyrus, and inferior temporal gyrus. For participants with electrodes implanted primarily in the right hemisphere, we mirrored electrode positions into the left hemisphere. We divided the neural data into 30-minute windows and applied Welch’s method to compute the median spectral power over non-overlapping 30-second sub-windows50. We excluded 30-minute windows with non-numeric data values, likely due to data breaks. On average, we used 160.4 ± 30.6 windows per participant (80.2 ± 15.3 hours) across all recording days. Spectral power was interpolated to integer frequencies and projected into the 4 predefined brain regions (see Peterson et al.18 for further methodological details).
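The windowed spectral estimate can be approximated with the NumPy sketch below for a single channel. It is not the validation code used for Fig. 3: the function names are hypothetical, a Hann taper is assumed, no bias correction is applied to the median, and the projection into atlas regions is omitted.

```python
import numpy as np

def median_welch_psd(x, sfreq, seg_s):
    """Median periodogram over non-overlapping, Hann-tapered segments."""
    n_seg = int(round(seg_s * sfreq))
    n_full = x.size // n_seg
    segs = x[: n_full * n_seg].reshape(n_full, n_seg)
    window = np.hanning(n_seg)
    segs = (segs - segs.mean(axis=1, keepdims=True)) * window
    psd = np.abs(np.fft.rfft(segs, axis=1)) ** 2 / (sfreq * np.sum(window ** 2))
    # Convert to a one-sided spectrum (Nyquist bin exists only for even n_seg).
    last = -1 if n_seg % 2 == 0 else None
    psd[:, 1:last] *= 2.0
    freqs = np.fft.rfftfreq(n_seg, d=1.0 / sfreq)
    return freqs, np.median(psd, axis=0)

def window_spectra(x, sfreq, win_s=1800.0, seg_s=30.0):
    """One spectrum per window, skipping windows with non-numeric values."""
    n_win = int(round(win_s * sfreq))
    int_freqs = np.arange(1, int(sfreq // 2))
    spectra = []
    for start in range(0, x.size - n_win + 1, n_win):
        chunk = x[start:start + n_win]
        if not np.all(np.isfinite(chunk)):
            continue  # likely a data break
        freqs, psd = median_welch_psd(chunk, sfreq, seg_s)
        # Interpolate to integer frequencies.
        spectra.append(np.interp(int_freqs, freqs, psd))
    return int_freqs, np.array(spectra)
```

With the default 30-minute windows, the reported average of 160.4 windows per participant corresponds to 160.4 × 0.5 = 80.2 hours, matching the figure quoted above.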

Fig. 3

Validation of intracranial neural signal quality. (a) Electrocorticography (ECoG) electrode positions are shown in MNI coordinates for each participant. ECoG power spectra are shown for (b) all 12 participants (shading denotes standard deviation) and (c) participant P01 over all available half-hour time windows. We projected spectral power into sensorimotor and temporal brain regions, excluding time windows with non-numeric values that likely indicated a data break. Lines for participant P01 denote power in each window (\(n=130\) total, or 65 hours). The power spectra shape (exponential decrease for increasing frequencies) and consistency over time demonstrate the cleanliness and stability of our neural recordings across multiple recording days.

Figure 3(b) shows the average spectral power across time windows, separated by participant. In general, power spectra remain quite consistent across participants with tight standard deviations across time windows, indicating that much of the ECoG data is of usable quality51,52. We also plotted the power spectra of each individual window for participant P01, as shown in Fig. 3(c). Again, the variation among time windows appears small, and we see clear differences in spectral power between sensorimotor (pre/postcentral gyri) and temporal areas, as expected. Additionally, we retained 92.3% ± 6.3% of ECoG electrodes per participant (Table 2), further demonstrating the quality of our neural data53,54.

We validated pose trajectories by comparing each pose estimation model’s output to our manual annotations of each participant’s pose (Table 5). While manual annotations are susceptible to human error55, they are often used to evaluate markerless pose estimation performance when marker-based motion capture is not possible30,56. We used root-mean-square (RMS) error averaged across all keypoints to evaluate model performance for the 950 frames used to train the model as well as 50 annotated frames that were withheld from training. RMS errors for the holdout set (5.71 ± 1.90 pixels) are notably larger than the train set errors (1.52 ± 0.12 pixels), as expected, but are still within an acceptable tolerance given that 3 pixels are approximately equal to just 1 cm33.
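As a small worked example of the error metric, the sketch below computes an RMS error over keypoints and converts pixels to centimetres using the approximate scale quoted above (3 pixels per cm). The exact error definition used by the pose estimation toolbox may differ; here RMS is taken over per-keypoint Euclidean distances, and the function names are hypothetical.

```python
import math

PIXELS_PER_CM = 3.0  # approximate scale quoted in the text

def rms_error(predicted, annotated):
    """RMS Euclidean distance (pixels) between matched (x, y) keypoints."""
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, annotated)
    ]
    return math.sqrt(sum(squared) / len(squared))

def pixels_to_cm(pixels):
    """Convert a pixel distance to centimetres at the approximate scale."""
    return pixels / PIXELS_PER_CM
```

At this scale, the mean holdout error of 5.71 pixels corresponds to roughly 1.9 cm and the mean training error of 1.52 pixels to roughly 0.5 cm.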

Table 5 Pose estimation model errors.

Usage Notes

We have developed a Jupyter Python dashboard that can be run online to facilitate data exploration without locally downloading the data files (https://github.com/BruntonUWBio/ajile12-nwb-data). Our dashboard includes visualizations of electrode locations, along with ECoG and wrist pose traces for a user-selected time window (Fig. 4). Users can also visualize the average contralateral wrist trajectory during identified movement events for each file. The dashboard streams from The DANDI Archive only the data needed for visualization, enabling efficient renderings of time segments from the large, 24-hour data files. Our code repository also includes all scripts necessary to create Figs. 2, 3 and Tables 2–4. In addition, we have previously used AJILE12 to decode and analyze the neurobehavioral variability of naturalistic wrist movements and have publicly released multiple workflows that can be modified for use on this dataset18,32,33.

Fig. 4

Browser-based Jupyter Python dashboard for dataset exploration. We designed a browser-based dashboard, available at https://github.com/BruntonUWBio/ajile12-nwb-data, to facilitate exploration of AJILE12 without needing to download any data files locally. (a) Participant keypoint positions are displayed for the first sample of a user-defined time window, with the option to animate keypoint positions across the entire window. We included a virtual neck marker for this visualization at the midpoint between the left and right shoulders. (b) Time-series traces of horizontal (x) and vertical (y) wrist positions are displayed over the same selected time window. (c) Electrode coverage is shown in MNI coordinates on a standardized brain model. This visualization is interactive, allowing three-dimensional rotations, alterations of hemisphere opacity to inspect depth electrodes, and the ability to visualize various electrode-level metadata such as electrode groups and identified bad electrodes. (d) Raw ECoG signals are visualized over the same user-selected time window, color-coded by electrode group.