Mobile BCI dataset of scalp- and ear-EEGs with ERP and SSVEP paradigms while standing, walking, and running

We present a mobile dataset comprising electroencephalography (EEG) recorded from the scalp and around the ear, together with locomotion-sensor data, from 24 participants moving at four different speeds while performing two brain-computer interface (BCI) tasks. The data were collected from a 32-channel scalp-EEG, a 14-channel ear-EEG, a 4-channel electrooculography, and 9-channel inertial measurement units placed at the forehead, left ankle, and right ankle. The recording conditions were standing, slow walking, fast walking, and slight running at speeds of 0, 0.8, 1.6, and 2.0 m/s, respectively. At each speed, two different BCI paradigms, event-related potential and steady-state visual evoked potential, were recorded. To evaluate the signal quality, the scalp- and ear-EEG data were qualitatively and quantitatively validated at each speed. We believe that this dataset will facilitate BCI research in diverse mobile environments by enabling the analysis of brain activity and the quantitative evaluation of performance, thereby expanding the use of practical BCIs.


Background & Summary
Human movement is a complex process that requires the integration of the central and peripheral nervous systems; therefore, researchers have analyzed human locomotion using brain activity 1,2 [4][5]. In particular, electroencephalography (EEG) has been the most common method for measuring brain activity owing to its high temporal resolution, portability, and ease of use 6; in addition, several attempts have been made to increase its practicality [7][8][9][10]. However, EEG recording in a mobile environment can introduce artifacts and signal distortion, resulting in a loss of accuracy and signal quality 11,12. Therefore, research in mobile environments is necessary to study brain activity during movement, mitigate such limitations, and improve practical BCI technology 9,12. Moreover, several studies have developed software techniques to realize practical BCIs, such as preprocessing algorithms that remove artifacts [13][14][15][16] or novel classification algorithms that improve the recognition of user intention 12,17.
To recognize human intention, two representative exogenous BCI paradigms, event-related potential (ERP) 18 and steady-state visual evoked potential (SSVEP) 19, are commonly used in mobile environments owing to their strong brain responses. The ERP is a time-locked brain response to stimuli (e.g., visual or auditory), including a positive peak (P300) that occurs approximately 300 ms after the stimulus appears. The ERP achieves relatively high performance in both scalp-EEG and ear-EEG, with accuracies of 85-95% for scalp-EEG 20,21 and approximately 70% for ear-EEG 9 in a static state. The SSVEP is a periodic brain response in the occipital area to stimuli flickering at a particular frequency. The performance of SSVEP is reliable in terms of accuracy and signal-to-noise ratio (SNR), with 80-95% accuracy for scalp-EEG 17,20 but 40-70% accuracy for ear-EEG, as the ear is located far from the occipital cortex 22,23. Brain-signal data obtained by performing BCI paradigms can be used to quantitatively evaluate signal quality in a mobile environment 24.
Portable and non-hair-bearing EEG has been frequently investigated to enhance the applicability of practical BCIs in the real world [25][26][27][28][29]. In particular, ear-EEG, which comprises electrodes placed inside or around the ear, has several advantages over conventional scalp-EEG in terms of stability, portability, and unobtrusiveness 9,30. Moreover, the signal quality of ear-EEG has been validated for recognizing human intention using several BCI paradigms, including ERP 9,29,31, SSVEP 22,23, and others 32.
Recently, EEG datasets for mobile environments have been published, including motion information and different mobile conditions. He et al. 33 recorded signals from 60-channel scalp-EEG, 4-channel electrooculography (EOG), and 6 goniometers from 8 participants while walking slowly at a constant speed of 0.45 m/s. They used a BCI paradigm, avatar control, by predicting the joint angles of the goniometers from the EEG while walking. Brantley et al. 34 collected a dataset of full-body locomotion from 10 participants on stairs, ramps, and level ground without BCI paradigms, recording from 60-channel scalp-EEG, 4-channel EOG, 12-channel electromyogram (EMG), and 17 inertial measurement units (IMUs). Wagner et al. 35 recorded signals from 108-channel scalp-EEG, 2-channel EMG, 2 pressure sensors, and 3 goniometers from 20 participants without BCI paradigms while walking at a constant speed. Consequently, although these datasets were collected in mobile environments, the movement condition was kept constant, and two of the datasets were collected without BCI paradigms. In addition, because only scalp-EEG signals were measured, the application to practical BCIs was restricted.
In this study, we present a mobile BCI dataset with scalp- and ear-EEGs collected from 24 participants performing BCI paradigms at different speeds. Data from a 32-channel scalp-EEG, a 14-channel ear-EEG, a 4-channel EOG, and 27-channel IMUs were recorded simultaneously. The experimental environment involved participants moving at speeds of 0, 0.8, 1.6, and 2.0 m/s on a treadmill. At each speed, two BCI paradigms were used to evaluate signal quality, facilitating diverse analyses, including time-domain analysis using the ERP data and frequency-domain analysis using the SSVEP data. We therefore believe that the dataset facilitates addressing questions of brain dynamics in diverse mobile environments, including the cognitive load of multitasking, locomotion complexity, and the quantitative evaluation of artifact-removal methods or classifiers for BCI tasks in mobile settings.

Participants
Twenty-four healthy individuals (14 men and 10 women, 24.5 ± 2.9 years of age) without any history of neurological or lower-limb pathology participated in this experiment. All of them participated in the ERP tasks, and 17 participants performed the slight running session at a speed of 2.0 m/s. In the SSVEP tasks, 23 participated, excluding one because of a personal matter unrelated to the experimental procedure; 16 participants performed the slight running session at a speed of 2.0 m/s. The slight running session was optional for all participants. This study was approved by the Institutional Review Board of Korea University (KUIRB-2019-0194-01), and all participants provided written informed consent before the experiments. All experiments were conducted in accordance with the Declaration of Helsinki.

Data Acquisition
For the experiment, we simultaneously collected data from three different modalities: scalp-EEG, ear-EEG, and IMU (Figure 1a, b, and c). All data can be accessed online 36. To synchronize the three devices, triggers were sent to the recording system of each device simultaneously while the paradigm was presented in MATLAB.
The head circumference of each participant was measured to select an appropriately sized scalp-EEG cap. We obtained signals from the scalp with 32 Ag/AgCl EEG electrodes placed according to the international 10/20 system using BrainAmp (Brain Products GmbH). The ground and reference electrodes were placed at Fpz and FCz, respectively. In addition, we used four EOG channels to capture dynamically changing eye movements such as blinking. The EOG channels were placed above and below the left eye to measure vertical eye artifacts (VEOG U and VEOG L) as well as at the left and right temples to measure horizontal eye artifacts (HEOG L and HEOG R) (Figure 1b). The sampling rate of the scalp-EEG and EOG was 500 Hz with a resolution of 32 bits. All electrode impedances were maintained below 50 kΩ, and most channels were reduced to below 20 kΩ 33,34.
The ear-EEG consists of cEEGrid electrodes located around each ear of the participant, with eight channels on the left, six channels on the right, and the ground and reference channels at the center of the right side (Figure 1c) 9. The two cEEGrids were connected to a wireless mobile DC EEG amplifier (SMARTING, mBrainTrain, Belgrade, Serbia). Data were recorded at a sampling rate of 500 Hz and a resolution of 24 bits. All ear-EEG impedances were maintained below 50 kΩ, and most channels were reduced to below 20 kΩ 33,34.
To measure the locomotion of the participants, three wearable IMU sensors (APDM Wearable Technologies) were placed at the head, left ankle, and right ankle. Each IMU comprised 9 channels: a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer. Therefore, 27 channels of IMU signals were collected, recorded at a sampling rate of 128 Hz with a resolution of 32 bits.

Experimental Paradigm
The participants performed tasks under the ERP and SSVEP paradigms, in which stimuli were displayed on the monitor during each session at different speeds. The two BCI paradigms were developed based on OpenBMI (http://openbmi.org) 20 and Psychtoolbox (http://psychtoolbox.org) 37 in MATLAB (The MathWorks, Natick, MA).
During the ERP task, target ('OOO') and non-target ('XXX') characters were presented on the monitor as visual stimuli. All characters were displayed on a black background at the center of the monitor screen. The proportion of targets was 0.2, and the total number of trials was 300. In each trial, one of the stimuli was presented for 0.5 s, followed by a fixation cross ('+') shown as a break for a random duration of 0.5-1.5 s (Figure 1d).
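As a concrete illustration, the trial structure described above (300 trials, 20% targets, 0.5 s stimulus, 0.5-1.5 s random fixation) can be sketched as follows. This is a minimal reconstruction for simulation or analysis purposes, not the authors' MATLAB stimulus code; the function name is ours.

```python
import random

def make_erp_sequence(n_trials=300, target_ratio=0.2, seed=0):
    """Build a randomized ERP trial list: 20% targets ('OOO') and 80%
    non-targets ('XXX'), each shown for 0.5 s and followed by a fixation
    cross for a random 0.5-1.5 s break."""
    rng = random.Random(seed)
    n_targets = int(n_trials * target_ratio)
    stimuli = ['OOO'] * n_targets + ['XXX'] * (n_trials - n_targets)
    rng.shuffle(stimuli)
    # Each trial: (stimulus, stimulus duration in s, fixation duration in s)
    return [(s, 0.5, rng.uniform(0.5, 1.5)) for s in stimuli]

trials = make_erp_sequence()
print(len(trials), sum(t[0] == 'OOO' for t in trials))  # 300 60
```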
During the SSVEP task, three target SSVEP visual stimuli were displayed at three positions (left, center, and right) on an LCD monitor 19,38. A stimulus frequency range of 5-30 Hz is known to be appropriate for obtaining SSVEP responses 39. It is also known that movement artifacts have a significant impact on the frequency spectrum below 12 Hz 40. Based on these studies, the stimuli were designed to flicker at 5.45, 8.57, and 12 Hz, which were calculated by dividing the monitor frame rate of 60 Hz by an integer (i.e., 60/11, 60/7, and 60/5) 20. The participants were asked to gaze in the direction of the target stimulus highlighted in yellow. In each trial, a randomly selected target was indicated for 2 s, after which all stimuli flickered for 5 s, followed by a break of 2 s. The SSVEP experiment consisted of 20 trials for each frequency, for a total of 60 trials per session (Figure 1e).
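The frame-rate arithmetic behind the three flicker frequencies can be verified in a few lines. The square-wave frame pattern below is one plausible realization of a frame-locked stimulus and is our assumption, not the authors' exact implementation.

```python
import numpy as np

def flicker_frequencies(refresh_hz=60, divisors=(11, 7, 5)):
    """Flicker frequencies realizable on a monitor: the refresh rate
    divided by an integer number of frames per cycle."""
    return [refresh_hz / d for d in divisors]

def frame_pattern(freq, refresh_hz=60, n_frames=60):
    """One plausible per-frame on/off pattern (an assumption): sample a
    square wave at the monitor's frame times."""
    t = np.arange(n_frames) / refresh_hz
    return (np.sin(2 * np.pi * freq * t) >= 0).astype(int)

print([round(f, 2) for f in flicker_frequencies()])  # [5.45, 8.57, 12.0]
```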

Experimental Protocol and Procedure
Figure 1a depicts the experimental setup of this study. The participants stood on a treadmill in a laboratory and were instructed to look at a 24-inch monitor (refresh rate: 60 Hz, resolution: 1920 × 1080 pixels) placed 80 (± 10) cm in front of them. The participants were monitored and instructed to minimize movements other than walking, such as of the neck or arms, to avoid additional artifacts. The mobile environment included standing, slow walking, fast walking, and slight running at speeds of 0, 0.8, 1.6, and 2.0 m/s, respectively, on the treadmill (0° inclination) 41,42.
To proceed with the experiment under the same conditions for each participant, the experimental procedures were performed sequentially (Figure 1f). The participants conducted the two BCI tasks while standing, slow walking, fast walking, and slight running on the treadmill. A training session for the ERP task was conducted at a speed of 0 m/s to train the ERP classifier prior to all ERP tasks. Each ERP and SSVEP session lasted 7-8 min, for totals of 40 min and 32 min, respectively, across all sessions. Each session followed the same procedure with a random sequence of targets. All sessions for one participant were performed on a single day. Moreover, because fatigue and habituation could be induced by the sequence of the experiment 43, the following measures were taken. First, participants were allowed sufficient breaks between and within sessions when needed. Furthermore, the participants were familiarized with the paradigm stimuli by being fully exposed to them before starting the experiment.

Preprocessing
We preprocessed the data using open-source toolboxes for EEG data: BBCI (https://github.com/bbci/bbci_public) 44, BCILAB (https://github.com/sccn/BCILAB) 45, and EEGLAB (https://sccn.ucsd.edu/eeglab) 46 in MATLAB. First, the data were high-pass filtered above 0.5 Hz using a fifth-order Butterworth filter. Thereafter, three procedures were performed: EOG removal, line-noise removal, and interpolation. Vertical EOG components were removed from the scalp-EEG using the flt_eog function in BCILAB 47. The line-noise removal method automatically removed artifacts that contained noise for extended periods of time. Bad channels carrying abnormal signals, with standard deviations above a z-score threshold, were removed using the flt_clean_channels function in BCILAB with a threshold of 4 and a window length of 5 s. The removed bad channels were interpolated using the super-fast spherical method to avoid losing channel information. On average, 2.38 ± 1.94 channels in the scalp-EEG and 1.35 ± 1.18 channels in the ear-EEG were removed and interpolated across all participants and sessions. All channels in the scalp-EEG and ear-EEG were each re-referenced to a common average reference. We down-sampled the data from the scalp-EEG, ear-EEG, and IMU sensors to 100 Hz. The continuous signal was segmented into epochs according to the timing of each paradigm. For the ERP, each trial was segmented from -200 to 800 ms relative to the stimulus presentation time. For the SSVEP, each trial was segmented from 0 to 5 s relative to the onset of stimulus flickering.
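The filtering, re-referencing, down-sampling, and epoching steps above can be sketched in Python with SciPy; the EOG regression, bad-channel rejection, and spherical interpolation done with BCILAB are omitted. This is an illustrative reconstruction under those assumptions, not the authors' MATLAB pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, decimate

def preprocess(eeg, fs=500, hp_cutoff=0.5, target_fs=100):
    """High-pass filter, common-average reference, and down-sample, as in
    the pipeline above. `eeg` is a channels x samples array."""
    # Fifth-order Butterworth high-pass at 0.5 Hz (applied zero-phase here)
    b, a = butter(5, hp_cutoff / (fs / 2), btype='highpass')
    eeg = filtfilt(b, a, eeg, axis=-1)
    # Re-reference every channel to the common average
    eeg = eeg - eeg.mean(axis=0, keepdims=True)
    # Down-sample 500 Hz -> 100 Hz (factor 5, with anti-alias filtering)
    return decimate(eeg, fs // target_fs, axis=-1)

def epoch(eeg, fs, onsets, tmin, tmax):
    """Cut trials from tmin to tmax seconds around each onset sample."""
    lo, hi = int(tmin * fs), int(tmax * fs)
    return np.stack([eeg[:, s + lo:s + hi] for s in onsets])

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 5000))          # 10 s of mock 32-channel data
y = preprocess(x)                            # -> shape (32, 1000) at 100 Hz
erp = epoch(y, 100, [200, 400], -0.2, 0.8)   # -> shape (2, 32, 100)
```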

Data Records
All data files are available in the Open Science Framework repository 36 under the terms of the Attribution 4.0 International Creative Commons License (http://creativecommons.org/licenses/by/4.0/).

Data Format
All data are provided according to the standardized Brain Imaging Data Structure (BIDS) format for EEG data 48, as shown in Figure 2. The data follow the BrainVision Core Data Format, developed by Brain Products GmbH. The data files are organized with the following naming convention: sub-XX_ses-YY_task-ZZ_eeg, where the session number ranges from 1 to 5 (session 1 indicates the training session for ERP, and sessions 2-5 indicate the speeds of 0, 0.8, 1.6, and 2.0 m/s, respectively) and the task is either ERP or SSVEP. In the 'sourcedata' folder, the data are separated into EEG and IMU because their sampling frequencies differ. In each subject folder, the data are down-sampled to 100 Hz, and data from all modalities are stored in a single file. The number of channels for each modality is listed in Table 1.
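For programmatic access, the naming convention above can be encoded as below. The session-to-speed mapping follows the description in the text; the helper function itself is our sketch, not part of the released dataset.

```python
def bids_name(sub, ses, task):
    """Compose a file name following the convention sub-XX_ses-YY_task-ZZ_eeg."""
    assert task in ('ERP', 'SSVEP') and 1 <= ses <= 5
    return f"sub-{sub:02d}_ses-{ses:02d}_task-{task}_eeg"

# Session 1 is the ERP training run; sessions 2-5 map to treadmill speeds.
SPEEDS = {2: 0.0, 3: 0.8, 4: 1.6, 5: 2.0}  # ses -> speed in m/s

print(bids_name(3, 4, 'SSVEP'))  # sub-03_ses-04_task-SSVEP_eeg
```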

Data Missing
The IMU data of participant 21 for ERP at 0 m/s and that of participant 12 for SSVEP at 0.8 m/s were missing because of a malfunction in the communication of the IMU during data collection.

Trials Missing
The numbers of trials for the ERP data of participant 11 at every speed and of participants 13 and 15 at a speed of 2.0 m/s, and for the SSVEP data of participant 14 at a speed of 2.0 m/s, were approximately two-thirds of the normal number of trials because of a malfunction in device communication.

Excluded Data
The data of participant 17 at 2.0 m/s for SSVEP, participant 19 at 2.0 m/s for ERP and SSVEP, and participant 20 at 2.0 m/s for ERP were excluded because the electrodes did not adhere well during recording, resulting in the loss of more than 50% of the data in a session.

Technical Validation
We evaluated the preprocessed data using the ERP and SSVEP paradigms in terms of accuracy and SNR. Statistical analysis was conducted using a one-tailed paired t-test comparing the performance at each moving speed to that at standing, indicated by an asterisk at a confidence level of 95%. Moreover, the ERP waves and the power spectral density (PSD) for SSVEP were used to evaluate signal quality. Figure 3a depicts an example of the scalp-EEG, ear-EEG, and IMU signals for 5 s at speeds of 0, 0.8, 1.6, and 2.0 m/s. The amplitudes of the scalp-EEG, ear-EEG, and IMU signals increased with speed 41,49,50. Figure 3b depicts example topographies of the scalp-EEG and ear-EEG at speeds of 0, 0.8, 1.6, and 2.0 m/s. The power of the scalp-EEG and ear-EEG increased with speed.

Statistical Verification
To verify the dataset, we performed statistical verification to demonstrate significant differences between the speeds across every channel of the scalp-EEG and ear-EEG. Figure 4a for the ERP and Figure 4b for the SSVEP depict topological maps of t-values in particular frequency bands: delta (0.5-3.5 Hz), theta (3.5-7.5 Hz), alpha (7.5-12.5 Hz), and beta (12.5-30 Hz) 51. Specifically, the PSDs in each frequency band were analyzed using cluster-based correction with non-parametric permutation testing for multiple comparisons to verify differences between the data at the four speeds of 0, 0.8, 1.6, and 2.0 m/s. The significance probabilities and critical values of the permutation distribution were estimated using the Monte-Carlo method with 10,000 iterations.
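As a simplified illustration of the permutation approach (without the cluster-based correction, which additionally requires channel-adjacency information), a paired Monte-Carlo permutation test on, for example, band power at one channel under two speeds can be sketched as:

```python
import numpy as np

def paired_permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided Monte-Carlo permutation test for paired samples: randomly
    flip the sign of each paired difference to build the null distribution
    of the mean difference, then count exceedances."""
    rng = np.random.default_rng(seed)
    d = np.asarray(a, float) - np.asarray(b, float)
    obs = d.mean()
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    null = (signs * d).mean(axis=1)
    # +1 in numerator and denominator avoids a p-value of exactly zero
    return (np.sum(np.abs(null) >= abs(obs)) + 1) / (n_perm + 1)
```

With a consistent shift across participants the p-value is small; with identical inputs it is 1.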
Significant channels may indicate that noise is included in the corresponding frequency bands and speeds. The topography of the delta band shows that step frequencies, which mostly fell in the range of 0.5-3.5 Hz, affected most channels at all speeds. During the slight running session, all channels, including those of the scalp-EEG and ear-EEG, differed significantly across the entire frequency band. In addition, paradigm-related areas, such as the occipital area during the SSVEP tasks and the central area during the ERP tasks, showed large t-values in the delta band, suggesting reduced concentration on the tasks due to the workload of multitasking.

Evaluation of ERP
The ERP dataset was evaluated by examining the ERP waves and by computing the area under the receiver operating characteristic curve (AUC) and an approximate SNR at each speed. The AUC indicates the true positive rate over the false positive rate of the results. To compute the AUC, ERP features were extracted as the power over consecutive 50 ms intervals from 200 ms to 450 ms. For classification, we used a conventional classifier, regularized linear discriminant analysis, to evaluate the ERP performance. The data from the training session at a speed of 0 m/s were used as the training set, and the datasets at the different speeds were used as the testing sets. The SNR indicates signal quality; the approximate SNR of the ERP was calculated as the root mean square (RMS) of the amplitude of the P300 peaks divided by the RMS of the average amplitude of the pre-stimulus baseline (-200-0 ms) at channel Pz 52,53.
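The feature extraction and approximate SNR described above can be sketched as follows. The exact P300 peak window is not specified in the text, so the 250-450 ms window used here is our assumption, as is the use of mean power per 50 ms interval.

```python
import numpy as np

def erp_features(epochs, fs=100, tmin=-0.2, t0=0.2, t1=0.45, win=0.05):
    """Mean power in consecutive 50 ms windows from 200 to 450 ms
    post-stimulus. `epochs` is trials x channels x samples, starting at
    `tmin` seconds relative to the stimulus."""
    n_win = round((t1 - t0) / win)  # 5 windows of 50 ms
    feats = []
    for i in range(n_win):
        lo = int(round((t0 + i * win - tmin) * fs))
        hi = lo + int(round(win * fs))
        feats.append((epochs[:, :, lo:hi] ** 2).mean(axis=-1))
    return np.concatenate(feats, axis=1)  # trials x (channels * windows)

def erp_snr(epoch_pz, fs=100, tmin=-0.2):
    """Approximate SNR: RMS in an assumed P300 window (250-450 ms) divided
    by the RMS of the pre-stimulus baseline (-200-0 ms) at one channel."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    base = epoch_pz[:int((0.0 - tmin) * fs)]
    p300 = epoch_pz[int((0.25 - tmin) * fs):int((0.45 - tmin) * fs)]
    return rms(p300) / rms(base)
```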
Figure 5a depicts the ERP waves of the target and non-target stimuli in the scalp- and ear-EEGs at channels Pz and L10 at each speed. The higher the speed, the lower the amplitude of the P300 components of the target in both the scalp- and ear-EEGs. Tables 2 and 3 list the performance of the ERP in the scalp-EEG and ear-EEG, respectively. The grand-average AUCs of the ERP across all participants were 0.90 ± 0.07 and 0.67 ± 0.07 (p < 0.05) in the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 0.72 ± 0.14 and 0.58 ± 0.06 (p < 0.05) in the ear-EEG at speeds of 0 and 1.6 m/s, respectively. The grand-average SNRs of the ERP across all participants were 0.95 ± 0.09 and 1.06 ± 0.14 (p < 0.05) for the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 1.06 ± 0.27 and 0.98 ± 0.05 for the ear-EEG at speeds of 0 and 1.6 m/s, respectively.

Evaluation of SSVEP
The SSVEP dataset was evaluated by statistical analysis of the signal properties using the PSD and by computing the accuracy and an approximate SNR at each speed. Accuracy was measured as the percentage of correct predictions out of the total number of cases. Canonical correlation analysis, which does not require a training dataset, was used for classification. The SNR of the SSVEP was calculated as the ratio of the power at the target frequencies to the power at the neighboring frequencies (resolution: 0.25 Hz, number of neighbors: 12) 54.
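The training-free CCA classification described above can be sketched with a NumPy-only canonical correlation (QR plus SVD). The reference-signal design with two harmonics is a common choice and an assumption here, not taken from the paper.

```python
import numpy as np

def cca_corr(X, Y):
    """First canonical correlation between two signal matrices
    (samples x channels), computed via QR decomposition and SVD."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

def ssvep_classify(epoch, fs=100, freqs=(5.45, 8.57, 12.0), n_harm=2):
    """Correlate an epoch (channels x samples) with sine/cosine references
    at each target frequency (plus harmonics) and return the index of the
    frequency with the largest canonical correlation."""
    t = np.arange(epoch.shape[1]) / fs
    scores = []
    for f in freqs:
        ref = np.column_stack([fn(2 * np.pi * f * (h + 1) * t)
                               for h in range(n_harm)
                               for fn in (np.sin, np.cos)])
        scores.append(cca_corr(epoch.T, ref))
    return int(np.argmax(scores))

# Demo: a noisy 8.57 Hz oscillation across three channels is recovered.
rng = np.random.default_rng(0)
t = np.arange(500) / 100                      # 5 s at 100 Hz
epoch = np.vstack([a * np.sin(2 * np.pi * 8.57 * t) for a in (1.0, 0.8, 0.5)])
epoch += 0.1 * rng.standard_normal(epoch.shape)
print(ssvep_classify(epoch))  # 1  (i.e., 8.57 Hz)
```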
Figure 5b depicts the PSD of the SSVEP for the scalp-EEG and ear-EEG at channels Oz and L10 at each speed. The higher the speed, the greater the power across all frequency spectra for both the scalp- and ear-EEGs. Tables 4 and 5 list the performance of the SSVEP for the scalp-EEG and ear-EEG, respectively. The grand-average accuracies of the SSVEP across all participants were 88.70 ± 19.52% and 80.65 ± 20.38% (p < 0.05) for the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 53.19 ± 13.93% and 39.57 ± 6.39% (p < 0.05) for the ear-EEG at speeds of 0 and 1.6 m/s, respectively. The grand-average SNRs of the SSVEP across all participants were 2.64 ± 0.99 and 1.92 ± 0.68 (p < 0.05) for the scalp-EEG at speeds of 0 and 1.6 m/s, respectively, and 1.21 ± 0.23 and 1.03 ± 0.10 (p < 0.05) for the ear-EEG at speeds of 0 and 1.6 m/s, respectively.

Usage Notes
This mobile dataset is available in the BrainVision Core Data Format. For analyzing the dataset, we recommend using a common open-source toolbox for EEG data, such as BBCI (https://github.com/bbci/bbci_public) 44, OpenBMI (http://openbmi.org) 20, and EEGLAB (https://sccn.ucsd.edu/eeglab) 46 in the MATLAB environment, or MNE (https://martinos.org/mne) 55 in the Python environment. The supporting code is available on GitHub (https://github.com/DeepBCI/Deep-BCI). For preprocessing, we recommend down-sampling so that all signals share the same sampling frequency, high-pass filtering out very low frequencies (at least below 0.1 Hz) to remove DC drift, and interpolating badly distorted channels. This dataset can be used for the performance evaluation of artifact-removal methods and for the analysis of mental states with quantitative evaluation via BCI paradigms in a mobile environment.

Figure 1. Experimental design. (a) Experimental setup while standing (0 m/s), slow walking (0.8 m/s), fast walking (1.6 m/s), and slight running (2.0 m/s) on the treadmill, wearing scalp-EEG, ear-EEG, EOG, and IMUs. Informed consent was obtained from the participant for publishing the figure. Channel placement of (b) scalp-EEG with EOG and (c) ear-EEG. Experimental paradigms for (d) the ERP paradigm with 300 trials and (e) the SSVEP paradigm with 60 trials. (f) Experimental procedure.

Figure 3. Examples of the signals and topography at different speeds. (a) Time-synchronized subset of scalp-EEG, ear-EEG, and IMU data for 5 s while moving at speeds of 0, 0.8, 1.6, and 2.0 m/s. The EOG V channel was calculated by subtracting the lower VEOG from the upper VEOG. (b) EEG power topography for each channel of the scalp-EEG and ear-EEG.

Figure 5. Grand averages across all participants of the ERP and SSVEP responses at the four speeds of 0, 0.8, 1.6, and 2.0 m/s. (a) Grand-average ERP waves of target and non-target stimuli in scalp-EEG and ear-EEG from -200 to 800 ms relative to the trigger at each speed. (b) Grand-average PSD for SSVEP in scalp-EEG (left) and ear-EEG (right) at each speed. The dashed lines indicate the target frequencies of 5.45, 8.57, and 12 Hz. Asterisk indicates a significance level of 1% between the performance at 0 m/s and the corresponding speed.

Table 1. Description of channel types.

Table 2. AUC and SNR of ERP in scalp-EEG. Asterisk indicates a significance level of 5% between the performance at 0 m/s and the corresponding speed.

Table 3. AUC and SNR of ERP in ear-EEG. Asterisk indicates a significance level of 5% between the performance at 0 m/s and the corresponding speed.

Table 4. Accuracy and SNR of SSVEP in scalp-EEG. Asterisk indicates a significance level of 5% between the performance at 0 m/s and the corresponding speed.

Table 5. Accuracy and SNR of SSVEP in ear-EEG. Asterisk indicates a significance level of 5% between the performance at 0 m/s and the corresponding speed.