Multi-channel EEG recordings during a sustained-attention driving task

We describe driver behaviour and brain dynamics acquired from a 90-minute sustained-attention task in an immersive driving simulator. The data included 62 sessions of 32-channel electroencephalography (EEG) data for 27 subjects driving on a four-lane highway who were instructed to keep the car cruising in the centre of the lane. Lane-departure events were randomly induced to cause the car to drift from the original cruising lane towards the left or right lane. A complete trial included events with deviation onset, response onset, and response offset. The next trial, in which the subject was instructed to drive back to the original cruising lane, began 5–10 seconds after finishing the previous trial. We believe that this dataset will lead to the development of novel neural processing methodology that can be used to index brain cortical dynamics and detect driving fatigue and drowsiness. This publicly available dataset will be beneficial to the neuroscience and brain-computer interface communities.


Background & Summary
Driving safety has attracted public attention due to the increasing number of road traffic accidents. Risky driving states, such as fatigue and drowsiness, increase drivers' risk of crashing, as fatigue suppresses driver performance, including awareness, recognition and directional control of the car 1 . In particular, high levels of fatigue and drowsiness diminish driver arousal and information processing abilities in unusual and emergency situations 2 .
During a sustained-attention driving task, fatigue and drowsiness are reflected in driver behaviours and brain dynamics 3 . Furthermore, electroencephalography (EEG) is the preferred method for human brain electrophysiological monitoring while performing tasks involving natural movements in a real-world environment 4 . In 2003, we began conducting laboratory-based experiments collecting EEG data to investigate brain function associated with sustained attention during a safe driving task 5,6 . Our experiments have two distinct goals: (1) evaluating neurocognitive performance, i.e., determining key signatures of how the neurocognitive state of the driver (e.g., physical and physiological) varies when faced with the sensory, perceptual and cognitive demands of a sustained-attention situation [7][8][9][10] ; and (2) developing advanced computational approaches, i.e., investigating novel computational, statistical modelling and data visualisation techniques to extract signatures of neurocognitive performance, including novel analytic and algorithmic approaches for individually assessing drivers' neurocognitive state and performance [11][12][13] .
To acquire the experimental dataset, we adopted an event-related lane-departure paradigm in a virtual-reality (VR) dynamic driving simulator to quantitatively measure brain EEG dynamics along with fluctuations in behavioural performance. All of the participants were required to have a driver's licence, and none of them had a history of psychological disorders. The 32-channel EEG signals and vehicle position were recorded simultaneously, and all of the participants were instructed to sustain their attention in this driving experiment.
Several research studies on driving performance, including kinaesthetic effects, mind-wandering trends and the development of drowsiness prediction systems, have been conducted by our team using this EEG dataset. Specifically, to study EEG dynamics in response to kinaesthetic stimuli during driving, we used a VR-based driving simulator with a motion platform to produce a somatic sensation similar to real-world situations 14 . In terms of mind-wandering trends, we investigated brain dynamics and behavioural changes in individuals experiencing low perceptual demands during a sustained-attention task 15 . In terms of the drowsiness prediction system, we proposed a brain-computer interface-based approach using spectral dynamics to classify driver alertness and predict response times [16][17][18][19][20] . We determined the amount of cognitive state information that can be extracted from noninvasively recorded EEG data and the feasibility of online assessment and rectification of brain networks exhibiting characteristic dynamic patterns in response to cognitive challenges.
These data descriptors describe a large EEG dataset in a sustained-attention driving task. We aim to help researchers reuse this dataset to further study the behavioural decision making of drivers under stress and cognitive fatigue in complex operational environments, such as car driving with kinaesthetic stimuli, which requires directly studying the interactions among the brain, behaviour, the sensory system and performance dynamics based on simultaneous measurements and joint analysis. We expect that this dataset could be used to explore principles and methods for the design of individualised real-time neuroergonomic systems to enhance the situational awareness and decision making of drivers under several forms of stress and cognitive fatigue, thereby improving total human-system performance. We believe this research will benefit the neuroscience and brain-computer interface communities.

Methods
Participants. Twenty-seven voluntary participants (age: 22-28 years) who were students or staff of the National Chiao Tung University were recruited to participate in a 90-minute sustained-attention driving task multiple times on the same or different days. In total, 62 EEG data sets were collected from these participants. The participants had normal or corrected-to-normal vision. In addition, none of the participants reported sleep deprivation in the preceding weeks, and none had a history of drug abuse according to the self-report. Every participant was required to have a normal work and rest cycle, get enough sleep (approximately 8 h of sleep each night) and not stay up late (no later than 11:00 PM) for a week before the experiment. Additionally, the participants did not consume alcohol or caffeinated drinks or participate in strenuous exercise during the day before the experiments. At the beginning of the experiment, a pre-test session was conducted to ensure the participants understood the instructions and to confirm that none were affected by simulator-induced nausea. This study was performed in strict accordance with the recommendations in the Guide for the Committee of Laboratory Care and Use of the National Chiao Tung University, Taiwan. The Institutional Review Board of the Veterans General Hospital, Taipei, Taiwan, approved the study. All of the participants were asked to read and sign an informed consent form before participating in the EEG experiments. The monetary compensation for one experimental session was approximately USD 20.
Virtual-reality driving environment. A VR driving environment with a dynamic driving simulator mounted on a six-degree-of-freedom Stewart motion platform was built to mirror reality behind the wheel. Six interactive highway driving scenes synchronised over local area networks were projected onto the screens at viewing angles of 0°, 42°, 84°, 180°, 276° and 318° to provide a nearly complete 360° visual field. The dimensions of the six directional scenes were 300 × 225 (width × height) cm, 290 × 225 cm, 260 × 195 cm, 520 × 195 cm, 260 × 195 cm, and 290 × 225 cm, respectively.
As shown in Fig. 1a,b, the experimental scenario involved a visually monotonous and unexciting night-time drive on a straight four-lane divided highway without other traffic. The distance from the left side to the right side of the road and the vehicle trajectory were quantised into values from 0-255, and the width of each lane was 60 units. The refresh rate of the scenario frame was set to emulate cruising at a speed of 100 km/hr. A real vehicle frame (Make: Ford; Model: Probe) (Fig. 1c) that included no unnecessary weight (such as an engine, wheels, and other components) was mounted on a six-degree-of-freedom Stewart motion platform (Fig. 1d). In addition, the driver's view of the VR driving environment was recorded and is shown in Fig. 1e.

Experimental paradigm.
An event-related lane-departure paradigm 3 was implemented in the VR-based driving simulator using WorldToolKit (WTK) R9 Direct and Visual C++. The paradigm was designed to quantitatively measure the subject's reaction time to perturbations during a continuous driving task. The experimental paradigm simulated night-time driving on a four-lane highway, and the subject was asked to keep the car cruising in the centre of the lane. The simulation was designed to mimic a non-ideal road surface that caused the car to drift with equal probability to the right or left of the third lane. The driving task continued for 90 minutes without breaks. Drivers' activities were monitored from the scene control room via a surveillance video camera mounted on the dashboard. Lane-departure trials were obtained from experimental data collected from 2005 to 2012 at National Chiao Tung University, Taiwan.
As shown in Fig. 2a, lane-departure events were randomly induced to make the car drift from the original cruising lane towards the left or right side (deviation onset). Each participant was instructed to quickly compensate for this perturbation by turning the steering wheel (response onset) to move the car back to the original cruising lane (response offset). To avoid the impacts of other factors during the task, participants only reacted to the lane-perturbation event by turning the steering wheel and did not have to control the accelerator or brake pedals in this experiment. Each lane-departure event was defined as a "trial", including a baseline period, deviation onset, response onset and response offset. EEG signals were recorded simultaneously (Fig. 2b). Additionally, the corresponding directions of turning the steering wheel are shown in Fig. 2c. Of note, the next trial occurred within a 5-10 second interval after finishing the current trial, during which the subject had to manoeuvre the car back to the centre line of the third car lane. If the participant fell asleep during the experiment, no feedback was provided to alert him/her.

Data records
Data recording and storage. During the experiment, the stimulus computer that generated the VR scene recorded the trajectories of the car and the events with time points in a "log" file. The stimulus computer also sent synchronised triggers (also recorded in the "log" file) to the Neuroscan EEG acquisition system. Concurrently, the Neuroscan system recorded EEG data with the time stamps of triggers in an "ev2" file. Because the number of time points differed between the two recorded files, the first step was to integrate the two files into a new file with aligned event timing and behavioural data. The new event file was then imported by EEGLAB in MATLAB.
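As an illustration only, the following Python sketch shows one way two such recordings could be brought onto a common time base; it is not the script used to build the dataset. The in-memory layouts are assumptions made for the example: hypothetical log rows of the form (time in seconds, trigger code or None, vehicle position) and hypothetical ev2 rows of the form (trigger code, EEG sample index). The shared triggers are paired in order, a least-squares line maps the log clock onto EEG samples, and every log row is then given an EEG latency.

```python
from statistics import mean


def fit_clock_map(log_times_s, eeg_samples):
    """Least-squares line: eeg_sample = slope * log_time + intercept."""
    t_bar, s_bar = mean(log_times_s), mean(eeg_samples)
    sxx = sum((t - t_bar) ** 2 for t in log_times_s)
    sxy = sum((t - t_bar) * (s - s_bar)
              for t, s in zip(log_times_s, eeg_samples))
    slope = sxy / sxx
    return slope, s_bar - slope * t_bar


def align_log_to_eeg(log_rows, ev2_rows):
    """Attach an EEG sample latency to every log row.

    log_rows: (time_s, trigger_code_or_None, position)  -- assumed layout
    ev2_rows: (trigger_code, eeg_sample)                -- assumed layout
    """
    log_triggers = [(t, code) for t, code, _ in log_rows if code is not None]
    if [code for _, code in log_triggers] != [code for code, _ in ev2_rows]:
        raise ValueError("trigger sequences differ; cannot pair them in order")
    slope, intercept = fit_clock_map(
        [t for t, _ in log_triggers], [s for _, s in ev2_rows]
    )
    return [(round(slope * t + intercept), code, pos)
            for t, code, pos in log_rows]


# Synthetic example: EEG sampled at 500 Hz, started 2 s before the log clock.
log_rows = [(2.0, 251, 130), (2.5, None, 140), (3.0, 253, 150), (5.0, 254, 128)]
ev2_rows = [(251, 2000), (253, 2500), (254, 3500)]
aligned = align_log_to_eeg(log_rows, ev2_rows)
print(aligned)
```

Fitting a line rather than a single offset also absorbs a small constant difference in clock rate between the two computers; whether that is needed for these recordings is not stated in the text.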
EEG signals were obtained using the Scan SynAmps2 Express system (Compumedics Ltd., VIC, Australia). Recorded EEG signals were collected using a wired EEG cap with 32 Ag/AgCl electrodes, including 30 EEG electrodes and 2 reference electrodes (opposite lateral mastoids). The EEG electrodes were placed according to a modified international 10-20 system. The contact impedance between all electrodes and the skin was kept under 5 kΩ. The EEG recordings were amplified by the same system and digitised at 500 Hz (resolution: 16 bits). Neuroscan's Scan 4.5 software was used for data acquisition. The acquired raw data were saved as .cnt files on the PC and server.
EEG signals. The raw files were read using the EEGLAB toolbox in MATLAB. The uploaded files with the .set suffix contain all of the signals. After loading the files, the "EEG.data" variable included 32 EEG signals and one signal for vehicle position. The first 32 signals were from the Fp1, Fp2, F7, F3, Fz, F4, F8, FT7, FC3, FCZ, FC4, FT8, T3, C3, Cz, C4, T4, TP7, CP3, CPz, CP4, TP8, A1, T5, P3, PZ, P4, T6, A2, O1, Oz and O2 electrodes. Two electrodes (A1 and A2) were references placed on the mastoid bones. The 33rd signal described the position of the simulated vehicle. Additionally, as shown in Table 1, the types of events (see "EEG.event.type") in the dataset were classified as deviation onset (mark: 251 or 252), response onset (mark 253) or response offset (mark 254). Of note, the time period between deviation onset and response onset was defined as the reaction time (RT). Figure 3 shows an example of behavioural performance (Fig. 3a) and EEG signals (Fig. 3b) with associated events.
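The event marks and the RT definition above can be turned into per-trial reaction times with a few lines of code. The sketch below is a hedged Python illustration, not part of the released MATLAB code: it assumes the events have already been exported as (integer mark, latency in samples) pairs sorted by latency, and it uses the 500 Hz sampling rate stated above. The function name and the dictionary keys are hypothetical.

```python
SRATE_HZ = 500                 # sampling rate stated in the text
DEVIATION_ONSET = {251, 252}   # deviation-onset marks
RESPONSE_ONSET = 253
RESPONSE_OFFSET = 254


def extract_trials(events, srate=SRATE_HZ):
    """Group events into complete trials and compute the reaction time.

    events: iterable of (mark, latency_in_samples), sorted by latency.
    A trial is kept only if deviation onset, response onset and
    response offset all occur in that order.
    """
    trials = []
    current = None
    for mark, latency in events:
        if mark in DEVIATION_ONSET:
            # A new deviation starts a new trial (any unfinished one is dropped).
            current = {"deviation_mark": mark, "deviation": latency}
        elif (mark == RESPONSE_ONSET and current is not None
              and "response_onset" not in current):
            current["response_onset"] = latency
        elif (mark == RESPONSE_OFFSET and current is not None
              and "response_onset" in current):
            current["response_offset"] = latency
            current["rt_s"] = (current["response_onset"]
                               - current["deviation"]) / srate
            trials.append(current)
            current = None
    return trials


events = [
    (251, 1000), (253, 1400), (254, 2600),   # RT = 400 samples = 0.8 s
    (252, 6000), (253, 7250), (254, 8000),   # RT = 1250 samples = 2.5 s
    (251, 12000),                            # incomplete trial, ignored
]
trials = extract_trials(events)
print([t["rt_s"] for t in trials])  # [0.8, 2.5]
```

If the marks are stored as strings in "EEG.event.type", they would need to be converted to integers before being passed to this function.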
Additionally, as shown in Table 2, we report the number of sessions per subject and include summary statistics on the number of events (including deviation onset, response onset, and response offset) per subject.
Of note, we have uploaded the raw experimental dataset 21 [file name: Multi-channel EEG recordings during a sustained-attention driving task (raw dataset)] and the pre-processed dataset 22 [file name: Multi-channel EEG recordings during a sustained-attention driving task (pre-processed dataset)] to the publicly accessible repository of figshare.

Technical Validation
Behavioural validation. The EEG dataset was collected from 27 subjects with normal or corrected-to-normal vision. No subjects reported a history of psychiatric disorders, neurological disease or drug use disorders. All of the subjects were recruited university students and staff at the National Chiao Tung University, Taiwan. At the beginning of the experiment, each subject wore a suitable cap for recording the electrophysiological data and was given 5 to 10 minutes to read the experimental instructions and complete the participant information sheet (questionnaire).
The subjects' facial videos and responses to the lane departure events were closely monitored. The experimenters visually observed the subjects' facial features, such as eye movements (blink rate, blink duration, long closure rate, etc.), head pose and gaze direction via the surveillance video to determine whether the subject took his/her eyes off the road. Most importantly, the behavioural data (vehicle trajectory) objectively confirmed the estimated RTs during the experiment.
The RTs reflecting the participant's promptness to respond to regular traffic events are considered an instantaneous measure of the level of fatigue and drowsiness. The RT to each lane-departure event (i.e., the time between the onset of the deviation and the onset of the response) was used as an objective behavioural measurement to characterise all EEG epochs. Three groups of epochs were defined: optimal-performance, suboptimal-performance, and poor-performance groups. Optimal-, suboptimal-, and poor-performance states might indicate that the participant performed the task with a low, intermediate, and high level of fatigue and drowsiness, respectively. For each subject, the RTs collected from the first 10 minutes of the experiment were used to construct a null distribution of optimal RTs.
EEG signals were recorded using Ag/AgCl electrodes attached to a 32-channel Quik-Cap (Compumedics Neuroscan). Thirty electrodes were arranged according to a modified international 10-20 system, and two reference electrodes were placed on both mastoid bones, as shown in Fig. 4a. The skin under the reference electrodes was abraded using Nuprep (Weaver and Co., USA) and disinfected with a 70% isopropyl alcohol swab before calibration. Notably, as shown in Fig. 4b, the impedance of the electrodes was calibrated to be under 5 kΩ using NaCl-based conductive gel (Quik-Gel, Neuromedical Supplies ® ). EEG signals from the electro-cap were amplified using the Scan NuAmps Express system (Compumedics Ltd., VIC, Australia) and recorded at a sampling rate of 500 Hz with 16-bit quantisation.
Table 3. The file names of the raw and pre-processed versions of the dataset. *The pre-processing steps included bandpass filters and artefact rejection. **EEG pre-processing and data analysis codes. ***Preprocessing and analysis guidelines for multi-channel EEG recordings during a sustained-attention driving task.

Table 2. Number of sessions and numbers of events per subject (columns: Subject No., Number of Sessions, Numbers of Events).
EEG validation. Following the practice of previous data descriptors on the reuse of EEG data 23,24 , all EEG data, in both raw and pre-processed versions, were saved in figshare. In the pre-processed dataset, all EEG data were saved after the pre-processing steps, which included bandpass filtering and artefact rejection. Specifically, raw EEG signals were subjected to 1-Hz high-pass and 50-Hz low-pass finite impulse response (FIR) filters. For artefact rejection, apparent eye-blink contamination in the EEG signals was first removed manually by visual inspection. Second, artefacts were removed by the Automatic Artifact Removal (AAR) plug-in for EEGLAB, which provided automatic correction of ocular and muscular artefacts in the EEG signals. The file names of the raw and pre-processed versions of the dataset are shown in Table 3.
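To make the 1-50 Hz pass band concrete, the following Python sketch designs a Hamming-windowed-sinc FIR band-pass filter and evaluates its gain at a few frequencies. It is an illustration under stated assumptions, not the EEGLAB filter used for the released data: the window type, the filter length (1651 taps) and the construction of the band-pass as the difference of two low-pass filters are choices made for this example only.

```python
import cmath
import math


def lowpass_taps(cutoff_hz, srate_hz, numtaps):
    """Hamming-windowed sinc low-pass FIR, normalised to unit gain at DC."""
    mid = (numtaps - 1) / 2
    fc = cutoff_hz / srate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(numtaps):
        x = n - mid
        # Ideal low-pass impulse response sin(2*pi*fc*x) / (pi*x), limit 2*fc at x = 0.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (numtaps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def bandpass_taps(low_hz, high_hz, srate_hz, numtaps):
    """Band-pass as the difference of two low-pass filters (numtaps must be odd)."""
    lower = lowpass_taps(low_hz, srate_hz, numtaps)
    upper = lowpass_taps(high_hz, srate_hz, numtaps)
    return [u - l for u, l in zip(upper, lower)]


def gain(taps, freq_hz, srate_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / srate_hz
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))


SRATE_HZ = 500.0  # sampling rate stated in the text
taps = bandpass_taps(1.0, 50.0, SRATE_HZ, numtaps=1651)  # length is an assumption

for f in (0.0, 10.0, 100.0):
    print(f, "Hz gain:", round(gain(taps, f, SRATE_HZ), 3))
```

The long filter is a consequence of the 1-Hz edge: at 500 Hz, a transition band of roughly 1 Hz needs on the order of a thousand taps or more with this window.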
Additionally, we shared this EEG dataset with our partner groups, including the University of California at San Diego (UCSD) and the DCS Corporation. Our findings are consistent with their results 25,26 , providing technical validation of this method for accurately estimating changes in driver arousal, fatigue, and vigilance levels by evaluating changes in behavioural and neurocognitive performance.

Usage Notes
The raw experimental dataset 21 and the pre-processed dataset 22 can be downloaded from the publicly accessible repository of figshare. Users do not need to register with figshare to download either version of the dataset (raw or pre-processed) to a personal computer. The raw and pre-processed versions of the dataset in the figshare projects are named "Multi-channel EEG recordings during a sustained-attention driving task (raw dataset)" and "Multi-channel EEG recordings during a sustained-attention driving task (pre-processed dataset)", respectively.
The data can be analysed in EEGLAB, which is a MATLAB toolbox with an interactive graphical user interface (GUI). It includes multiple functions for processing continuous and event-related EEG using independent component analysis (ICA), time/frequency analysis and other methods, including artefact rejection, and runs under multiple operating systems. EEGLAB also provides extensive tutorials (https://sccn.ucsd.edu/wiki/EEGLAB_TUTORIAL_OUTLINE) to help researchers conduct data analyses. We recommend that researchers use EEGLAB version 5.03 on Windows 7 or Linux.
A data analysis tutorial (named "Tutorial Data Analysis for Multi-channel EEG Recordings during a Sustained-attention Driving Task.pdf") and MATLAB codes (named "Code-availability.zip") are provided as reference material for EEG pre-processing and data analysis during a sustained-attention driving task. These items can be accessed at our figshare webpage, ensuring that researchers can easily reuse the dataset.
Additionally, we provide some key notes regarding the data analysis.
1. Load the existing dataset. Select the 'File' menu item and then the 'Load existing dataset' sub-menu item. Then, select the existing dataset (e.g., s01_051017m.set) in the pop-up window. If the pre-processed dataset is used, each file must first be decompressed, and then the .set file should be selected.
2. Check the workspace in MATLAB. The 'EEG' variable contains the following information:
   EEG.srate: sampling rate
   EEG.chanlocs: channel labels and locations (one entry per channel)
   EEG.event: event type and latency
   EEG.data: EEG signals (channels × time points)
3. Extract data epochs and conduct further data analysis. To study the event-related EEG dynamics of continuously recorded data, extract data epochs time-locked to the events of interest (for example, the onsets of one class of experimental stimuli) by selecting Tools > Extract Epochs. Removing a mean baseline value from each epoch is useful when there are baseline differences between data epochs (e.g., arising from low-frequency drifts or artefacts). EEGLAB also contains several functions for plotting averages of dataset trials/epochs, selecting data epochs, comparing event-related brain potential (ERP) images, working with ICA components, decomposing time/frequency information and combining multiple datasets.
4. Consider the sample size. For a population of 62 datasets, a 95% confidence level and a 5% margin of error, the minimum sample size is 54 datasets.
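The sample-size figure in the last note can be reproduced with a standard calculation. The Python sketch below assumes that the intended method is Cochran's formula with a finite-population correction and a conservative proportion of 0.5; the text does not name the formula, so this is an interpretation that happens to give the same number.

```python
import math


def min_sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Cochran's sample size with finite-population correction.

    z = 1.96 corresponds to a 95% confidence level; proportion = 0.5 is
    the most conservative choice (assumed here, not stated in the text).
    """
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2  # infinite population
    n = n0 / (1 + (n0 - 1) / population)                       # finite correction
    return math.ceil(n)


print(min_sample_size(62))  # 54
```

With these inputs the uncorrected size is about 384 and the corrected value is about 53.5, which rounds up to 54.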