Background & Summary

Human movement is a complex process that requires the integration of the central and peripheral nervous systems; therefore, researchers have analyzed human locomotion using brain activity1,2. Brain-computer interfaces (BCIs) establish communication between human thoughts and external devices to restore motor and sensory function in disabled patients and to support the daily lives of healthy people3,4,5. In particular, electroencephalography (EEG) is the most common method for measuring brain activity owing to its high temporal resolution, portability, and ease of use6; in addition, several attempts have been made to increase its practicality7,8,9,10. However, EEG recording in a mobile environment is prone to artifacts and signal distortion, which degrade accuracy and signal quality11,12. Research in mobile environments is therefore necessary to study brain activity during movement, mitigate these limitations, and improve practical BCI technology9,12. Moreover, several studies have developed software techniques toward practical BCI, such as preprocessing algorithms that remove artifacts13,14,15,16 and novel classification algorithms that improve the decoding of user intention12,17.

To recognize human intention, two representative exogenous BCI paradigms, the event-related potential (ERP)18 and the steady-state visual evoked potential (SSVEP)19, are commonly used in mobile environments owing to their strong brain responses. The ERP is a time-locked brain response to a stimulus (e.g., visual or auditory), including a positive peak (P300) that occurs approximately 300 ms after stimulus onset. The ERP yields relatively high performance in both scalp-EEG and ear-EEG, with accuracies of 85–95% for scalp-EEG20,21 and approximately 70% for ear-EEG9 in a static state. The SSVEP is a periodic brain response in the occipital area to stimuli flickering at a particular frequency. SSVEP performance is reliable in terms of accuracy and signal-to-noise ratio (SNR), with 80–95% accuracy for scalp-EEG17,20 but 40–70% accuracy for ear-EEG, as the ear is located far from the occipital cortex22,23. Brain signals recorded while performing BCI paradigms can be used to quantitatively evaluate signal quality in a mobile environment24.

Portable and non-hair-bearing EEG systems have been widely investigated to enhance the applicability of practical BCI in the real world25,26,27,28,29. In particular, ear-EEG, which comprises electrodes placed inside or around the ear, has several advantages over conventional scalp-EEG in terms of stability, portability, and unobtrusiveness9,30. Moreover, the signal quality of ear-EEG has been validated for recognizing human intention using several BCI paradigms, including ERP9,29,31, SSVEP22,23, and others32.

Recently, EEG datasets for mobile environments have been published, including motion information and different mobile conditions. He et al.33 recorded signals from 60-channel scalp-EEG, 4-channel electrooculography (EOG), and 6 goniometers from 8 participants walking slowly at a constant speed of 0.45 m/s. They implemented an avatar-control BCI paradigm by predicting the goniometer joint angles from the EEG during walking. Brantley et al.34 collected a dataset of full-body locomotion from 10 participants on stairs, ramps, and level ground without BCI paradigms, recording 60-channel scalp-EEG, 4-channel EOG, 12-channel electromyogram (EMG), and 17 inertial measurement units (IMUs). Wagner et al.35 recorded 108-channel scalp-EEG, 2-channel EMG, 2 pressure sensors, and 3 goniometers from 20 participants walking at a constant speed without BCI paradigms. Consequently, although these datasets were collected in mobile environments, the movement condition was kept constant, and two of the datasets were collected without BCI paradigms. In addition, because only scalp-EEG signals were measured, their applicability to practical BCI was limited.

In this study, we present a mobile BCI dataset with scalp- and ear-EEG collected from 24 participants performing BCI paradigms at different speeds. Data from 32-channel scalp-EEG, 14-channel ear-EEG, 4-channel EOG, and 27-channel IMUs were recorded simultaneously. The experimental environment involved movement at speeds of 0, 0.8, 1.6, and 2.0 m/s on a treadmill. At each speed, two BCI paradigms were used to evaluate signal quality, which facilitates diverse analyses, including time-domain analysis using ERP data and frequency-domain analysis using SSVEP data. We therefore believe that this dataset can help address questions of brain dynamics in diverse mobile environments, such as the cognitive load of multitasking and locomotion complexity, and support quantitative evaluation of artifact removal methods and classifiers for BCI tasks in mobile environments.

Methods

Participants

Twenty-four healthy individuals (14 men and 10 women, 24.5 ± 2.9 years of age) without any history of neurological or lower-limb pathology participated in this experiment. All 24 participated in the ERP tasks, of whom 17 performed the slight running session at 2.0 m/s. Twenty-three participated in the SSVEP tasks, with one excluded because of a personal problem unrelated to the experimental procedure, and 16 performed the slight running session at 2.0 m/s. The slight running session was optional for all participants. This study was approved by the Institutional Review Board of Korea University (KUIRB-2019-0194-01), and all participants provided written informed consent before the experiments. All experiments were conducted in accordance with the Declaration of Helsinki.

Data acquisition

For the experiment, we simultaneously collected data from three modalities: scalp-EEG, ear-EEG, and IMU (Fig. 1a–c). All data can be accessed from the repository36. To synchronize the three devices, triggers were sent to the recording system of each device simultaneously while the paradigm was presented in MATLAB.

Fig. 1
figure 1

Experimental design. (a) Experimental setup while standing (0 m/s), slow walking (0.8 m/s), fast walking (1.6 m/s), and slight running (2.0 m/s) on the treadmill, wearing scalp-EEG, ear-EEG, EOG, and IMUs. Informed consent was obtained from the participant for publishing the figure. Channel placement of (b) scalp-EEG with EOG and (c) ear-EEG. Experimental paradigms for (d) ERP paradigm with 300 trials and (e) SSVEP paradigm with 60 trials. (f) Experimental procedure.

The head circumference of each participant was measured to select an appropriately sized scalp-EEG cap. We recorded signals from the scalp with 32 Ag/AgCl EEG electrodes placed according to the international 10/20 system using a BrainAmp amplifier (Brain Products GmbH). The ground and reference electrodes were placed at Fpz and FCz, respectively. In addition, we used four EOG channels to capture dynamically changing eye movements such as blinks. The EOG channels were placed above and below the left eye to measure vertical eye artifacts (VEOGu and VEOGL), and at the left and right temples to measure horizontal eye artifacts (HEOGL and HEOGR) (Fig. 1b). The scalp-EEG and EOG were sampled at 500 Hz with a resolution of 32 bits. All electrode impedances were maintained below 50 kΩ, and most channels were reduced to below 20 kΩ33,34.

The ear-EEG consists of cEEGrid electrodes located around each ear of the participant, with eight channels on the left, six channels on the right, and ground and reference channels on the right at the center (Fig. 1c)9. The two cEEGrids were connected to a wireless mobile DC EEG amplifier (SMARTING, mBrainTrain, Belgrade, Serbia). Data were recorded at a sampling rate of 500 Hz and a resolution of 24 bits. All ear-EEG impedances were maintained below 50 kΩ, and most channels were reduced to below 20 kΩ33,34.

To measure the locomotion of the participants, three wearable IMU sensors (APDM Wearable Technologies) were placed on the head, left ankle, and right ankle. Each IMU comprised nine channels: a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer. Therefore, 27 IMU channels were collected, recorded at a sampling rate of 128 Hz with a resolution of 32 bits.

Experimental paradigm

The participants performed tasks under the ERP and SSVEP paradigms, in which stimuli were displayed on a monitor during each session at different speeds. The two BCI paradigms were implemented using OpenBMI (http://openbmi.org)20 and Psychtoolbox (http://psychtoolbox.org)37 in MATLAB (The MathWorks, Natick, MA).

During the ERP task, target (‘OOO’) and non-target (‘XXX’) characters were presented on the monitor as visual stimuli. All characters were displayed at the center of the screen on a black background. The proportion of targets was 0.2, and the total number of trials was 300. In each trial, one of the stimuli was presented for 0.5 s, followed by a fixation cross (‘+’) for a random break interval of 0.5–1.5 s (Fig. 1d).
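A trial sequence with these parameters can be generated in a few lines; the sketch below assumes a uniformly random inter-stimulus interval and uses a hypothetical helper name (the original MATLAB/OpenBMI implementation is not reproduced here):

```python
import random

def generate_erp_sequence(n_trials=300, target_prop=0.2, seed=0):
    """Randomized ERP trial sequence (hypothetical helper).

    Each trial is (stimulus, isi): stimulus is 'OOO' (target) or 'XXX'
    (non-target) shown for 0.5 s, followed by a fixation cross for a
    random 0.5-1.5 s inter-stimulus interval.
    """
    rng = random.Random(seed)
    n_targets = int(n_trials * target_prop)          # 0.2 * 300 = 60 targets
    stimuli = ['OOO'] * n_targets + ['XXX'] * (n_trials - n_targets)
    rng.shuffle(stimuli)                             # randomize target positions
    return [(s, rng.uniform(0.5, 1.5)) for s in stimuli]

trials = generate_erp_sequence()
```

With the defaults, this yields 300 trials, 60 of which are targets, in random order.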

During the SSVEP task, three target SSVEP visual stimuli were displayed at three positions (left, center, and right) on an LCD monitor19,38. Stimulation frequencies in the range of 5–30 Hz are known to be appropriate for eliciting SSVEP responses39, and movement artifacts are known to significantly affect the frequency spectrum below 12 Hz40. Based on these studies, the stimuli were designed to flicker at 5.45, 8.57, and 12 Hz, obtained by dividing the 60 Hz monitor frame rate by an integer (i.e., 60/11, 60/7, and 60/5)20. The participants were asked to gaze in the direction of the target stimulus highlighted in yellow. In each trial, a randomly selected target was indicated for 2 s, after which all stimuli flickered for 5 s, followed by a 2 s break. The SSVEP experiment consisted of 20 trials per frequency, for a total of 60 trials per session (Fig. 1e).
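The frequency choice can be verified directly: dividing the frame rate by an integer guarantees that each flicker cycle spans a whole number of frames, so the stimulus can be rendered without temporal jitter.

```python
# Stimulus frequencies as integer divisions of the 60 Hz monitor frame rate,
# so each flicker cycle spans a whole number of frames.
frame_rate = 60.0
divisors = [11, 7, 5]                                  # frames per flicker cycle
stim_freqs = [round(frame_rate / d, 2) for d in divisors]
print(stim_freqs)  # [5.45, 8.57, 12.0]
```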

Experimental protocol and procedure

Figure 1a depicts the experimental setup of this study. The participants stood on a treadmill in the laboratory and were instructed to look at a 24-inch monitor (refresh rate: 60 Hz, resolution: 1920 × 1080 pixels) placed 80 (±10) cm in front of them. The participants were monitored and instructed to minimize movements other than walking, such as neck or arm movements, to avoid additional artifacts. The mobile environment included standing, slow walking, fast walking, and slight running at speeds of 0, 0.8, 1.6, and 2.0 m/s, respectively, on the treadmill (0° inclination)41,42.

To ensure the same conditions for each participant, the experimental procedures were performed in a fixed sequence (Fig. 1f). The participants performed the two BCI tasks while standing, slow walking, fast walking, and slight running on the treadmill. A training session for the ERP task was conducted at 0 m/s to train the ERP classifier prior to all ERP tasks. Each ERP and SSVEP session lasted 7–8 min, for totals of approximately 40 min (ERP, including training) and 32 min (SSVEP) across all sessions. Each session followed the same procedure with a random sequence of targets. All sessions for one participant were performed on a single day. Moreover, because the experimental sequence could induce fatigue and habituation43, the following measures were taken. First, participants were allowed sufficient breaks between and within sessions whenever needed. Furthermore, the participants were familiarized with the paradigm stimuli by being fully exposed to them before starting the experiment.

Preprocessing

We preprocessed the data using open-source toolboxes for EEG data, namely BBCI (https://github.com/bbci/bbci_public)44, BCILAB (https://github.com/sccn/BCILAB)45, and EEGLAB (https://sccn.ucsd.edu/eeglab)46 in MATLAB. First, the data were high-pass filtered above 0.5 Hz using a fifth-order Butterworth filter. Thereafter, three procedures were performed: EOG removal, noisy-channel removal, and interpolation. Vertical EOG components were removed from the scalp-EEG using the flt_eog function in BCILAB47. The noisy-channel removal method automatically rejected channels that carried abnormal signals for extended periods, i.e., whose z-scored standard deviation exceeded a threshold, using the flt_clean_channels function in BCILAB with a threshold of 4 and a window length of 5 s. The removed channels were interpolated using the super-fast spherical method to avoid losing channel information. On average, 2.38 ± 1.94 channels in the scalp-EEG and 1.35 ± 1.18 channels in the ear-EEG were removed and interpolated across all participants and sessions. The scalp-EEG and ear-EEG channels were each re-referenced to a common average reference. We down-sampled the data from the scalp-EEG, ear-EEG, and IMU sensors to 100 Hz. The continuous signals were segmented into epochs according to the trigger timing of each paradigm. For the ERP, each trial was segmented from −200 to 800 ms relative to stimulus onset. For the SSVEP, each trial was segmented from 0 to 5 s relative to the onset of stimulus flickering.
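For users working outside MATLAB, the filtering, re-referencing, down-sampling, and epoching steps can be sketched with NumPy/SciPy as below. This is a minimal sketch under stated assumptions: the EOG removal, noisy-channel rejection, and spherical interpolation performed in BCILAB are omitted, and the function names are our own.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, decimate

def preprocess(eeg, fs=500, fs_new=100, hp_cutoff=0.5):
    """Filter, re-reference, and down-sample a channels x samples array."""
    # fifth-order Butterworth high-pass above 0.5 Hz (zero-phase)
    sos = butter(5, hp_cutoff, btype='highpass', fs=fs, output='sos')
    eeg = sosfiltfilt(sos, eeg, axis=-1)
    # common average reference across channels
    eeg = eeg - eeg.mean(axis=0, keepdims=True)
    # down-sample 500 Hz -> 100 Hz (decimate applies an anti-alias filter)
    return decimate(eeg, fs // fs_new, axis=-1)

def epoch(data, onsets, fs=100, tmin=-0.2, tmax=0.8):
    """Cut continuous data into trials around stimulus onsets (in samples),
    e.g. -200 to 800 ms for ERP or 0 to 5 s for SSVEP."""
    pre, post = int(round(-tmin * fs)), int(round(tmax * fs))
    return np.stack([data[:, o - pre:o + post] for o in onsets])
```

For example, a 10 s recording of 32 channels at 500 Hz becomes a (32, 1000) array after `preprocess`, and `epoch(..., onsets=[100, 500])` yields a (2, 32, 100) trial array.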

Data Records

All data files are available in Open Science Framework repository36 and are available under the terms of Attribution 4.0 International Creative Commons License (http://creativecommons.org/licenses/by/4.0/).

Data format

All data are provided in the standardized Brain Imaging Data Structure (BIDS) format for EEG data48, as shown in Fig. 2. The data follow the BrainVision Core Data Format developed by Brain Products GmbH. The data files are organized according to the following naming convention:

sub-XX_ses-YY_task-ZZ_eeg

where the session number YY ranges from 1 to 5 (session 1 is the training session for ERP, and sessions 2–5 correspond to speeds of 0, 0.8, 1.6, and 2.0 m/s, respectively), and the task ZZ is either ERP or SSVEP. In the ‘sourcedata’ folder, the data are separated into EEG and IMU because their sampling frequencies differ. In each subject folder, the data are down-sampled to 100 Hz, and data from all modalities are stored in a single file. The number of channels for each modality is listed in Table 1.
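The naming convention can be applied programmatically when iterating over the dataset; a small sketch (the helper name and session map are ours, following the convention above):

```python
# Session numbers and their meaning, per the naming convention above
SESSIONS = {1: 'training (ERP only)', 2: '0 m/s', 3: '0.8 m/s',
            4: '1.6 m/s', 5: '2.0 m/s'}

def data_file(sub, ses, task):
    """Build a file name following the dataset convention,
    e.g. data_file(3, 2, 'SSVEP') -> 'sub-03_ses-02_task-SSVEP_eeg'."""
    assert 1 <= sub <= 24 and ses in SESSIONS and task in ('ERP', 'SSVEP')
    return f"sub-{sub:02d}_ses-{ses:02d}_task-{task}_eeg"
```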

Fig. 2
figure 2

Data folder structure. The folders and files are described, including (a) raw data and (b) preprocessed data in the data repository. Folders are named with ‘sub-XX’, ‘ses-YY’, and modality, and files are named ‘sub-XX_ses-YY_task-ZZ_WW’. Here, ‘sub-XX’ indicates the participant identifier (01–24); ‘ses-YY’ indicates the session number: training (01), standing (02), slow walking (03), fast walking (04), and slight running (05); ‘task-ZZ’ indicates the executed BCI paradigm (ERP or SSVEP); and the modality ‘WW’ indicates the modality of the data: EEG (scalp-EEG and ear-EEG) or IMU.

Table 1 Description of channel types and information of three modalities.

Missing data

Data missing

The IMU data of participant 21 for ERP at 0 m/s and that of participant 12 for SSVEP at 0.8 m/s were missing because of a malfunction in the communication of the IMU during data collection.

Trials missing

The number of trials for the ERP data of participant 11 at every speed and participants 13 and 15 at 2.0 m/s, and for the SSVEP data of participant 14 at 2.0 m/s, was approximately two-thirds of the normal number of trials because of a malfunction in device communication.

Excluded data

The data of participant 17 at 2.0 m/s for SSVEP, participant 19 at 2.0 m/s for ERP and SSVEP, and participant 20 at 2.0 m/s for ERP were excluded because the electrodes did not adhere well during data recording, resulting in a loss of more than 50% of the trials in a session.

Technical Validation

Figure 3a depicts an example of the scalp-EEG, ear-EEG, and IMU signals for 5 s at speeds of 0, 0.8, 1.6, and 2.0 m/s. The amplitudes of the scalp-EEG, ear-EEG, and IMU signals increased with speed41,49,50. Figure 3b depicts an example of the topography of the scalp-EEG and ear-EEG at speeds of 0, 0.8, 1.6, and 2.0 m/s. The powers of the scalp-EEG and ear-EEG also increased with speed. To evaluate the dataset, statistical analysis was conducted using a one-tailed paired t-test comparing the performance at each moving speed to that at standing, indicated by asterisks at a confidence level of 95%. We quantitatively evaluated the preprocessed data of the ERP and SSVEP paradigms in terms of accuracy and SNR. Moreover, baseline-corrected waveforms for the ERP and the power spectral density (PSD) for the SSVEP were plotted to evaluate signal quality.

Fig. 3
figure 3

Examples of the signals and topography at different speeds. (a) Time-synchronized subset of scalp-EEG, ear-EEG, and IMU data for 5 s while moving at different speeds of 0, 0.8, 1.6, and 2.0 m/s. The EOGV channel was calculated by subtracting lower VEOG from upper VEOG. (b) EEG power topography in each channel of scalp-EEG and ear-EEG.

Statistical verification

To verify the dataset, we performed statistical tests to identify significant differences between speeds across every channel of the scalp-EEG and ear-EEG. Figure 4a (ERP) and Fig. 4b (SSVEP) depict topological maps of t-values in particular frequency bands: delta (0.5–3.5 Hz), theta (3.5–7.5 Hz), alpha (7.5–12.5 Hz), and beta (12.5–30 Hz)51. Specifically, the PSD in each frequency band was analyzed using cluster-based correction with non-parametric permutation testing for multiple comparisons to test for differences between the data at the four speeds (0, 0.8, 1.6, and 2.0 m/s). The significance probabilities and critical values of the permutation distribution were estimated using the Monte Carlo method with 10,000 iterations.

Fig. 4
figure 4

Statistical differences in PSD in each frequency band for scalp- and ear-EEGs between standing and the other speeds during (a) ERP and (b) SSVEP tasks. The colored topological maps indicate t-values, and electrodes in clusters showing a statistically significant effect on spectral power between the corresponding speeds are marked with black asterisks (p < 0.05, cluster-based correction for multiple comparisons).

Significant channels indicate that noise is present in the corresponding frequency bands and speeds. The topography of the delta band shows that step frequencies, which were mostly in the range of 0.5–3.5 Hz, affected most channels at all speeds. During the slight running session, all channels of the scalp-EEG and ear-EEG differed significantly across the entire frequency range. In addition, paradigm-related areas, such as the occipital area during SSVEP tasks and the central area during ERP tasks, showed large t-values in the delta band, suggesting reduced concentration on the tasks due to the workload of multitasking.
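The Monte Carlo permutation testing described above can be sketched for a single channel as a sign-flip test on the paired per-participant differences. This is a simplified sketch (one channel, two conditions, no cluster-based correction; the function name is ours):

```python
import numpy as np

def paired_permutation_test(a, b, n_perm=10000, seed=0):
    """Monte Carlo paired permutation test on per-participant values.

    a, b: 1-D arrays (one value per participant, e.g. band power at one
    channel for two speeds). Returns a two-sided p-value for the mean
    paired difference, estimated by randomly flipping difference signs.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(a) - np.asarray(b)
    observed = diff.mean()
    # under H0 the sign of each participant's difference is exchangeable
    signs = rng.choice([-1, 1], size=(n_perm, diff.size))
    null = (signs * diff).mean(axis=1)
    return float((np.abs(null) >= abs(observed)).mean())
```

The cluster-based variant applied in Fig. 4 additionally forms clusters of neighboring significant channels and compares cluster-level statistics against the permutation distribution.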

Evaluation of ERP

The ERP dataset was evaluated using the ERP waveforms and two metrics at each speed: the area under the receiver operating characteristic curve (AUC) and an approximate SNR. The AUC indicates the true positive rate over the false positive rate. To compute the AUC, ERP features were extracted as the power in 50 ms windows from 200 to 450 ms. For classification, we used a conventional classifier, regularized linear discriminant analysis, to evaluate ERP performance. The data from the training session at 0 m/s were used as the training set, and the datasets at the different speeds were used as the testing sets. The SNR indicates signal quality; the approximate SNR of the ERP was calculated as the root mean square (RMS) amplitude of the P300 peak divided by the RMS amplitude of the pre-stimulus baseline (−200 to 0 ms) at channel Pz52,53.
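The feature extraction and the approximate SNR can be sketched as follows. This is a hedged sketch, not the paper's implementation: it uses the mean amplitude per window rather than window power, and it assumes a P300 peak window of 250–450 ms, which the paper does not specify exactly.

```python
import numpy as np

def erp_features(epochs, fs=100, t0=-0.2):
    """Mean amplitude in five 50 ms windows from 200 to 450 ms for every
    channel, flattened into one feature vector per trial.
    epochs: trials x channels x samples, starting at t0 seconds."""
    win = int(round(0.05 * fs))                      # 50 ms -> 5 samples
    feats = []
    for start in (0.20, 0.25, 0.30, 0.35, 0.40):
        i0 = int(round((start - t0) * fs))
        feats.append(epochs[..., i0:i0 + win].mean(axis=-1))
    return np.stack(feats, axis=-1).reshape(len(epochs), -1)

def erp_snr(epoch_ch, fs=100, t0=-0.2, peak=(0.25, 0.45)):
    """Approximate SNR at one channel (e.g., Pz): RMS in an assumed P300
    window divided by RMS of the -200 to 0 ms pre-stimulus baseline."""
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    i = lambda t: int(round((t - t0) * fs))
    return rms(epoch_ch[i(peak[0]):i(peak[1])]) / rms(epoch_ch[i(-0.2):i(0.0)])
```

For 32-channel epochs this yields 160 features per trial (32 channels × 5 windows), which could then be fed to a regularized LDA classifier.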

Figure 5a depicts the baseline corrected waves of the target and non-target in the scalp- and ear-EEGs at channels Pz and L10 at each speed. The higher the speed, the lower the amplitude of the P300 components of the target in both the scalp- and ear-EEGs. Tables 2 and 3 list the performance of ERP in the scalp-EEG and ear-EEG, respectively. The grand average AUCs of ERP for all participants were 0.90 ± 0.07, and 0.67 ± 0.07 (p < 0.05) in the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 0.72 ± 0.14 and 0.58 ± 0.06 (p < 0.05) in the ear-EEG at speeds of 0 and 1.6 m/s, respectively. The grand average SNRs of ERP for all participants were 0.95 ± 0.09 and 1.06 ± 0.14 (p < 0.05) for the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 1.06 ± 0.27 and 0.98 ± 0.05 for the ear-EEG at speeds of 0 and 1.6 m/s, respectively.

Fig. 5
figure 5

Grand-average ERP and SSVEP waveforms of all participants at the four speeds of 0, 0.8, 1.6, and 2.0 m/s. (a) Grand-average baseline-corrected ERP waveforms for target and non-target stimuli in scalp-EEG at Pz and ear-EEG at L10, from −200 to 800 ms relative to the trigger, at the different speeds. (b) Grand-average PSD for SSVEP in scalp-EEG at Oz (left) and ear-EEG at L10 (right) at the different speeds. Dashed lines indicate the target frequencies (5.45, 8.57, and 12 Hz).

Table 2 AUC and SNR of ERP in scalp-EEG.
Table 3 AUC and SNR of ERP in ear-EEG.

Evaluation of SSVEP

The SSVEP dataset was evaluated by statistical analysis of the signal properties using the PSD and two metrics at each speed: accuracy and an approximate SNR. Accuracy was measured as the percentage of correct predictions over the total number of trials. Canonical correlation analysis (CCA), which requires no training data, was used for classification. The SNR of the SSVEP was calculated as the ratio of the power at the target frequency to the power of the neighboring frequencies (resolution: 0.25 Hz, number of neighbors: 12)54.
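CCA scores each candidate frequency by the canonical correlation between the multichannel trial and a set of sine/cosine reference signals at that frequency and its harmonics, and predicts the best-scoring frequency. A minimal NumPy sketch (function names are ours; two harmonics are assumed, and the paper's exact implementation may differ):

```python
import numpy as np

def cca_corr(X, Y):
    """Largest canonical correlation between X (samples x channels)
    and Y (samples x references), via QR decomposition and SVD."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)[0]

def classify_ssvep(trial, freqs, fs=100, n_harmonics=2):
    """Predict the gazed stimulus frequency for one trial (samples x channels)."""
    t = np.arange(trial.shape[0]) / fs
    scores = []
    for f in freqs:
        # sin/cos references at the stimulus frequency and its harmonics
        refs = np.column_stack([fn(2 * np.pi * f * (h + 1) * t)
                                for h in range(n_harmonics)
                                for fn in (np.sin, np.cos)])
        scores.append(cca_corr(trial, refs))
    return freqs[int(np.argmax(scores))]
```

For a 5 s trial at 100 Hz containing an 8.57 Hz response, `classify_ssvep(trial, [5.45, 8.57, 12.0])` should return 8.57.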

Figure 5b depicts the PSD of the SSVEP for the scalp-EEG and ear-EEG at channels Oz and L10 at each speed. The higher the speed, the greater the power across the entire frequency spectrum for both the scalp- and ear-EEGs. Tables 4 and 5 list the performance of the SSVEP for scalp-EEG and ear-EEG, respectively. The grand average accuracies of SSVEP for all participants were 88.70 ± 19.52% and 80.65 ± 20.38% (p < 0.05) for scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 53.19 ± 13.93% and 39.57 ± 6.39% (p < 0.05) for ear-EEG at speeds of 0 and 1.6 m/s, respectively. The grand average SNRs of SSVEP for all participants were 2.64 ± 0.99 and 1.92 ± 0.68 (p < 0.05) for the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 1.21 ± 0.23 and 1.03 ± 0.10 (p < 0.05) for the ear-EEG at speeds of 0 and 1.6 m/s, respectively.

Table 4 Accuracy and SNR of SSVEP in scalp-EEG.
Table 5 Accuracy and SNR of SSVEP in ear-EEG.

Usage Notes

This mobile dataset is available in the BrainVision Core Data Format. For analyzing the dataset, we recommend common open-source toolboxes for EEG data, such as BBCI (https://github.com/bbci/bbci_public)44, OpenBMI (http://openbmi.org)20, and EEGLAB (https://sccn.ucsd.edu/eeglab)46 in the MATLAB environment, or MNE (https://martinos.org/mne)55 in the Python environment. The supporting code is available on GitHub (https://github.com/DeepBCI/Deep-BCI). For preprocessing, we recommend down-sampling so that all signals have the same sampling frequency, high-pass filtering out frequencies below at least 0.1 Hz to remove DC drift, and interpolating strongly deviating channels. This dataset can be used for performance evaluation of artifact removal methods and for analysis of mental states with quantitative evaluation via BCI paradigms in a mobile environment.