An open-access dataset of naturalistic viewing using simultaneous EEG-fMRI

Telesford, Qawi K.; Gonzalez-Moreira, Eduardo; Xu, Ting; Tian, Yiwen; Colcombe, Stanley J.; Cloud, Jessica; Russ, Brian E.; Falchier, Arnaud; Nentwich, Maximilian; Madsen, Jens; Parra, Lucas C.; Schroeder, Charles E.; Milham, Michael P.; Franco, Alexandre R.

doi:10.1038/s41597-023-02458-8

Data Descriptor
Open access
Published: 23 August 2023

Scientific Data volume 10, Article number: 554 (2023)

Abstract

In this work, we present a dataset that combines functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) to use as a resource for understanding human brain function in these two imaging modalities. The dataset can also be used for optimizing preprocessing methods for simultaneously collected imaging data. The dataset includes simultaneously collected recordings from 22 individuals (ages: 23–51) across various visual and naturalistic stimuli. In addition, physiological, eye tracking, electrocardiography, and cognitive and behavioral data were collected along with this neuroimaging data. Visual tasks include a flickering checkerboard collected outside and inside the MRI scanner (EEG-only) and simultaneous EEG-fMRI recordings. Simultaneous recordings include rest, the visual paradigm Inscapes, and several short video movies representing naturalistic stimuli. Raw and preprocessed data are openly available to download. We present this dataset as part of an effort to provide open-access data to increase the opportunity for discoveries and understanding of the human brain and evaluate the correlation between electrical brain activity and blood oxygen level-dependent (BOLD) signals.

Background & Summary

Simultaneous collection of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) data is an attractive approach to imaging as it combines the high spatial resolution of fMRI with the high temporal resolution of EEG. Combining modalities allows researchers to integrate spatial and temporal information while overcoming the limitations of a single imaging modality^1,2. Nevertheless, collecting multimodal data simultaneously requires specific expertise, and researchers must overcome various technical challenges to successfully collect data. Such challenges may limit its broader usage in the research community.

There are several technical challenges encountered when collecting imaging modalities simultaneously. With EEG, the main challenge is due to various sources of noise that impact the recorded signal. Gradient artifact is the most significant source of noise in simultaneous recordings, caused by the magnetic field gradients during fMRI acquisition, which induce current into EEG electrodes³. Another noise source is the ballistocardiogram (BCG) signal, which captures the ballistic forces of blood in the cardiac cycle^4,5. The BCG artifact arises from the pulsation of arteries in the scalp that causes movement in EEG electrodes and generates voltage. The BCG artifact is more pronounced in a strong magnetic field and increases with field strength⁶. In addition to gradient and BCG artifacts, other noise sources include the MRI helium compressor⁷, eye blinks⁸, head movement, and respiratory artifacts⁹. Additionally, while collecting fMRI data, one of the main issues is patient discomfort while wearing the EEG cap in the scanner, which can cause increased head motion. Likewise, preparation time for collecting both datasets can also increase participant burden. Collecting simultaneous fMRI and EEG requires overcoming a variety of technical challenges but also needs advanced preprocessing techniques to overcome these unavoidable artifacts and produce a cleaner signal. In this paper, we detail how we addressed various technical challenges encountered when recording simultaneous EEG-fMRI including strategies to improve data quality.

For this dataset, most of the tasks performed by the participants are naturalistic viewing tasks. Naturalistic stimuli represent paradigms considered more complex and dynamic than task-based stimuli^10,11. Naturalistic viewing provides more physiologically relevant conditions and produces closer to real-world brain responses^12,13,14. Naturalistic stimuli also contain narrative structure and provide context that reflects real-life experiences^14,15. Moreover, movies have been found to have high intersubject correlation and reliability^16,17, hold subjects’ attention¹⁸, and improve compliance related to motion and wakefulness¹⁹. Naturalistic movies are also an ideal stimulus for multimodal data sets and may be useful in linking responses across levels^20,21 and species²².

In this manuscript, we present a dataset collected at the Nathan S. Kline Institute for Psychiatric Research (NKI) in Orangeburg, NY, representing a study using simultaneously collected EEG and fMRI in healthy adults. The dataset contains multiple task conditions across two scans, including a visual task, resting state, and naturalistic stimuli. We also present quality control metrics for both modalities and describe preprocessing steps to clean up the EEG data. Lastly, we openly share these raw and processed data through the International Neuroimaging Data-Sharing Initiative (INDI) along with preprocessing code available on GitHub.

Methods

Participants and procedures

Simultaneous EEG-fMRI was collected in twenty-two adults (ages 23–51 years; mean age: 36.8; 50% male) recruited from the Rockland County, NY community. Participants enrolled in this study have no history of psychiatric or neurological illnesses. All imaging was collected using a 3 T Siemens TrioTim equipped with a 12-channel head coil. EEG data were collected using an MR-compatible system by Brain Products consisting of the BrainCap MR with 64 channels, two 32-channel BrainAmp MR amplifiers, and a PowerPack battery. Cortical electrodes were arranged according to the international 10–20 system. Inside the scanner, eye tracking was collected in the left eye using the EyeLink 1000 Plus.

Participants attended two sessions separated by 2 to 354 days (time between scans, mean: 38.2 days; median: 11 days); see Table 1 for the breakdown of data acquired during sessions. The scanning protocol consisted of three recording settings. The “Outside” setting was an EEG recording collected outside the MRI scanner in a non-shielded room; the “Scanner OFF” setting consisted of EEG recordings collected inside the static field of the MRI scanner while the scanner was off; the “Scanner ON” setting consisted of the simultaneous EEG and fMRI recordings. All research performed was approved by NKI’s Institutional Review Board (IRB# 941632). Prior to the experiment, written informed consent was obtained from all participants. Participants also provided demographic information and behavioral data, including information on their last month of sleep (Pittsburgh Sleep Study)²³, the amount of sleep they had the previous night, and their caffeine intake before the scan session.

Table 1 Simultaneous EEG-fMRI experimental design. EEG and fMRI were collected simultaneously across two scan sessions.

EEG acquisition

The entire data collection procedure takes approximately three hours. About 45 minutes are spent preparing a participant for EEG. Preparation begins with measuring the participant’s head to determine cap size. An anterior to posterior measurement is also taken across the top of the head, from the nasion to the inion. A mark is made at the center of the forehead at 10% of the measured nasion-inion distance. A properly sized cap is placed on the participant’s head with the Fpz electrode centered at the mark on the forehead. Once the subject is fitted with the EEG cap, electrodes are filled with electrolyte gel. In this study, EEG is collected using a customized cap to record 61 cortical channels, two electrooculogram (EOG) channels placed above (channel 64) and below the left eye (channel 63), and one electrocardiography (ECG) channel (channel 32) placed on the back. In addition, the cap also contains a reference and ground electrode. Electrodes were filled using V19 Abralyt HiCl electrode gel. Electrode impedance was recorded before every recorded run; to ensure good data quality, electrode impedance was kept below 20 kΩ. EEG was recorded using BrainVision Recorder at a sampling rate of 5 kHz. After cap preparation, participants completed a single run of the flickering checkerboard experiment in the “Outside” scan condition viewing a 19-inch LCD monitor. After the Outside scan, a 3D scan of the participant’s head is collected to digitize the position of EEG electrodes. 3D scans were collected using a portable scanner, the Occipital Structure Sensor (Occipital Inc, Boulder CO), and iPad Mini 4 (Apple, Cupertino, CA). Due to protected health information (PHI) restrictions, 3D scans will not be available in the data release; however, location files containing the positions of the electrodes will be provided with this release.

Simultaneous EEG-fMRI recording in the MRI scanner

After 3D digitization, participants enter the MRI scanner and are placed on the scanner bed in the supine position. Cushions are placed around the head to provide stabilization and minimize head motion during scans. At this stage, participants are provided with MR-safe goggles if they have any visual impairment that requires glasses. The screen is rear projected at the end of the MRI scanner at 1300 mm to the mirror mounted on the head coil. The videos are shown on a projector screen with a size of 440 × 330 mm with a 1024 × 768 resolution. This creates a horizontal and vertical viewing angle of 19.35° and 14.51° respectively, with a resolution of 0.0189 degrees/pixel for both directions. The light in the room was kept on during the imaging session. Participants were also fitted with a respiratory transducer belt to monitor breathing, which was recorded using BIOPAC MP150 (BIOPAC Systems, Inc., Goleta, CA).
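The reported angular resolution can be approximately reproduced from the screen geometry. The following is a minimal sketch, assuming the per-pixel angle is derived from the pixel pitch and the 1300 mm viewing distance, and that the total viewing angle is the per-pixel angle multiplied by the pixel count. The function and variable names are illustrative and do not come from the study code.

```python
import math

def degrees_per_pixel(size_mm: float, pixels: int, distance_mm: float) -> float:
    """Visual angle subtended by a single pixel, in degrees."""
    pixel_pitch_mm = size_mm / pixels
    return math.degrees(math.atan(pixel_pitch_mm / distance_mm))

DISTANCE_MM = 1300.0  # screen-to-mirror distance stated in the text

deg_px_h = degrees_per_pixel(440.0, 1024, DISTANCE_MM)  # horizontal
deg_px_v = degrees_per_pixel(330.0, 768, DISTANCE_MM)   # vertical

# Approximate total viewing angles (assumption: per-pixel angle times pixel count).
angle_h = deg_px_h * 1024
angle_v = deg_px_v * 768

print(f"{deg_px_h:.4f} deg/pixel, {angle_h:.2f} x {angle_v:.2f} deg")
```

Under these assumptions the result agrees with the reported values to within rounding of the per-pixel figure.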

Once positioned in the scanner, the participant’s head is enclosed in a 12-channel 3 T head matrix MR coil. The cable bundle (flat ribbon) from the EEG cap is routed through the front of the head coil and fixed at the top of the head coil using medical tape. The cable bundles are connected to the amplifiers and battery pack. During EEG data acquisition, the software is synchronized to the master clock of the MRI scanner. During recordings, EEG data was continuously collected along with task onset triggers, and volume triggers were recorded at the beginning of each TR. For specifics on the equipment and connections for simultaneous recordings, see Fig. 1 and Table 2.

Table 2 Equipment used for data collection and stimuli presentation in EEG-fMRI study.

Eye tracking acquisition

For recordings collected inside the scanner, eye position and pupil dilation were recorded using an infrared-based eye tracker (EyeLink 1000 Plus, SR Research Ltd., Ontario, Canada; http://www.sr-research.com) at a sampling rate of 1000 Hz. Prior to release, data were downsampled to 250 Hz. The eye tracker was calibrated using a 9-point grid before recordings in the MRI scanner. Participants were asked to direct their gaze at dots presented on the grid. Calibration was followed by a validation step, repeated until the error between the two measurements was less than 1°²⁴.
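The text does not state how the 1000 Hz recordings were reduced to 250 Hz. As one possible illustration, the sketch below averages non-overlapping blocks of four samples; this method, the function name, and the synthetic data are assumptions, not the procedure used for the release.

```python
from statistics import fmean

def downsample_by_averaging(samples: list[float], factor: int) -> list[float]:
    """Average non-overlapping blocks of `factor` samples; drop any incomplete tail."""
    n_blocks = len(samples) // factor
    return [fmean(samples[i * factor:(i + 1) * factor]) for i in range(n_blocks)]

ORIGINAL_HZ, RELEASED_HZ = 1000, 250
factor = ORIGINAL_HZ // RELEASED_HZ  # 4 samples per released sample

# Synthetic one-second ramp standing in for a gaze coordinate trace.
one_second_of_gaze_x = [float(i) for i in range(ORIGINAL_HZ)]
released = downsample_by_averaging(one_second_of_gaze_x, factor)
```

Block averaging acts as a crude low-pass filter before decimation; a dedicated anti-aliasing filter would be a reasonable alternative.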

MRI data acquisition

MRI data were acquired using a 12-channel head coil on 3.0 T Siemens TIM Trio. MPRAGE structural T1w images were acquired with the following parameters: TR = 2500 ms; TI = 1200 ms; TE = 2.5 ms; slices = 192; matrix size = 256 × 256; voxel size = 1 mm³ isotropic; flip angle = 8°; partial Fourier off; pixel bandwidth = 190 Hz/Px. All BOLD fMRI sequences were acquired with these parameters: TR = 2100 ms; TE = 24.6 ms; Flip Angle = 60°; slices = 38; matrix size = 64 × 64; voxel size = 3.469 × 3.469 × 3.330 mm. The run length for each task is listed in Table 1.

Task data/stimuli description

In this section, we describe the data generated for this study focused on collecting simultaneous EEG and fMRI. This dataset consists of a task, naturalistic stimuli, and resting state data. In addition to the simultaneously collected data, task data was collected outside the MRI scanner and inside the scanner environment with the scanner off. The collection of this data enables us to assess the impact of changes in the scanning environment on the EEG recordings. EEG-fMRI data were collected across two scan sessions; structural data was collected during the middle of the scan (see Table 1 for details). Code for presenting task stimuli and naturalistic stimuli, along with code to preprocess EEG and fMRI imaging data is available on GitHub (https://github.com/NathanKlineInstitute/NATVIEW_EEGFMRI).

Checkerboard stimulus

Flickering visual stimuli have been used to investigate the visual system in EEG^25,26 and fMRI^27,28. A high-contrast flickering checkerboard was used to stimulate the primary visual cortical regions. Participants were shown a flickering radial checkerboard at a frequency of 12 Hz in 20-second trials following a 20-second rest period across five repetitions. The checkerboard stimulus was presented outside the scanner after cap placement, inside the scanner with the scanner off, and inside the scanner during simultaneous EEG-fMRI recording. These three recordings were collected to measure the impact of the MRI scanning sequence on the recorded EEG.
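The block structure described above can be written out as a list of onsets. This is a minimal sketch, assuming the run starts with a rest block at t = 0 and that each of the five repetitions is a 20-second rest followed by a 20-second checkerboard block; any additional rest at the end of the run is not modeled, and all names are illustrative.

```python
REST_S = 20.0      # rest block duration (s)
STIM_S = 20.0      # checkerboard block duration (s)
N_REPS = 5         # number of rest/checkerboard repetitions
FLICKER_HZ = 12.0  # checkerboard flicker frequency

def block_onsets(rest_s: float, stim_s: float, n_reps: int):
    """Return (rest_onsets, stim_onsets) in seconds, assuming the run starts with rest at t = 0."""
    cycle = rest_s + stim_s
    rest = [i * cycle for i in range(n_reps)]
    stim = [i * cycle + rest_s for i in range(n_reps)]
    return rest, stim

rest_onsets, stim_onsets = block_onsets(REST_S, STIM_S, N_REPS)
total_duration_s = N_REPS * (REST_S + STIM_S)
flicker_cycles_per_block = int(FLICKER_HZ * STIM_S)
```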

Rest

The participant is presented with a white fixation cross in the center of a black screen and instructed to rest with eyes open. Participants had one rest scan per session, and each scan had a duration of 10 minutes.

Inscapes

Inscapes is a computer-generated animation featuring abstract 3D shapes that move in slow, continuous transitions. The video was originally developed as a 7-minute video for children to watch during brain scans, providing enough stimulation to keep them engaged while minimizing some aspects of cognitive processing¹⁹. Participants were presented with an extended version of Inscapes that was 10 minutes long. Similar to the rest scan, the Inscapes video was viewed once per scan session.

Predictive eye estimation regression (PEER) calibration

Predictive eye estimation regression (PEER) is an imaging-based calibration scan used to estimate the direction of eye gaze²⁹; here it can be used as a complement to the optical eye tracking of the EyeLink 1000 Plus. Participants were told to direct their gaze at dots that appeared at predefined points on the screen. The PEER method estimates eye gaze from the collected scan using support vector regression (SVR). The algorithm estimates the direction of eye gaze during each repetition (TR) in the fMRI time series.

Naturalistic stimuli (Movies)

Participants viewed three movies twice in one scan session. Videos varied between 258 s and 600 s. Naturalistic stimuli included “The Present” [4m18s] (uploaded to YouTube 7 Feb 2016)³⁰, two 10-minute clips from “Despicable Me” [clips taken from Russian Blu-Ray with exact times 1:02:09-1:12:09 (English) and 0:12:12-0:22:12 (Hungarian)]³¹, and three 5-minute monkey videos³². The monkey videos are part of a database with multiple videos³³; for this study, Monkey 1, Monkey 2, and Monkey 5 represent the first, second, and fifth videos in the database, respectively. The videos used in this study are available to download in the GitHub repository (https://github.com/NathanKlineInstitute/NATVIEW_EEGFMRI/tree/main/stimulus).

Limitations

The fMRI imaging in this dataset was collected using a 12-channel head coil on a 3 T scanner platform. While we possess the ability to scan at 3 T with a 32-channel head coil, the head coil design does not allow for the EEG cap cable bundle to be routed perpendicularly from the top of the head. Moreover, the design of the head coil does not permit alternate routing of the cable bundle because it will block the participant’s eyes or the cable length is too short to reach the amplifiers. In this study, the cable bundle is routed through the front and then taped to the top of the head coil before it is connected to the amplifier. Using a 12-channel head coil limited the sequences that could be used to collect fMRI, including the use of multiband sequences. Another limiting factor is that imaging sequences with faster TRs could not be used while collecting data simultaneously. The main issue is a safety concern regarding radiofrequency (RF) power deposition that causes heating in the EEG leads and electrodes during a sequence³⁴. In this study, we used a TR of 2100 ms to ensure participants were not at risk of discomfort or burns due to the heating of electrodes.

We have not been able to calculate the delay between the stimulus computer and the projector, due to lack of equipment to conduct the appropriate measurement (photodiode)³⁵. If in the future we do purchase this equipment and are able to perform this measurement, this information will be posted on the study website.

EEG preprocessing

We developed an automated pipeline to preprocess all EEG data collected outside and inside the MRI scanner. The preprocessing methods used on the EEG data depended on where data were acquired. All EEG data were preprocessed using EEGLAB and associated plugins³⁶. For data collected inside the scanner, data were preprocessed using the FMRIB plug-in for EEGLAB, provided by the University of Oxford Centre for Functional MRI of the Brain (FMRIB)^37,38.

For data collected in the Outside setting, the following preprocessing steps were used: (i) bandpass filter using a Hamming windowed sinc FIR filter between 0.5 Hz and 70 Hz; (ii) reference electrodes using average reference, excluding the ECG channel, EOG channels, and electrodes excluded during the EEG quality control process.

For data collected in the Scanner OFF setting, the initial preprocessing steps used in the Outside setting were used. In addition, pulse artifact detection and removal were used due to the increased contamination of the signal caused by the participant’s heartbeat³: (i) QRS/heartbeat detection using the ECG channel; (ii) pulse artifact/BCG removal using template subtraction based on the median artifact; (iii) bandpass filter using a Hamming windowed sinc FIR filter between 0.5 Hz and 70 Hz; (iv) reference electrodes using average reference, excluding the ECG channel and EOG channels.

For data collected in the Scanner ON setting, the initial preprocessing steps used in the Scanner OFF setting were used. In addition, gradient artifact removal was used to remove the contamination of the EEG signal caused by the changing gradients from the fMRI pulse sequence³⁷: (i) gradient artifact removal; (ii) QRS/heartbeat detection using the ECG channel; (iii) pulse artifact/BCG removal using template subtraction based on the median artifact; (iv) bandpass filter using a Hamming windowed sinc FIR filter between 0.5 Hz and 70 Hz; (v) reference electrodes using average reference, excluding the ECG channel and EOG channels.
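The three pipelines differ only in the artifact-removal steps that precede the shared filtering and re-referencing. The sketch below summarizes that ordering as a lookup table; it is only a restatement of the steps listed above, and the step strings and function name are illustrative rather than identifiers from the released code.

```python
# Steps shared by all settings. In the Outside setting, electrodes excluded
# during EEG quality control are also left out of the average reference.
COMMON_STEPS = [
    "bandpass filter 0.5-70 Hz (Hamming windowed sinc FIR)",
    "average reference (excluding ECG and EOG channels)",
]
PULSE_STEPS = [
    "QRS/heartbeat detection on the ECG channel",
    "pulse artifact/BCG removal (median template subtraction)",
]
GRADIENT_STEPS = ["gradient artifact removal"]

PIPELINES = {
    "Outside": COMMON_STEPS,
    "Scanner OFF": PULSE_STEPS + COMMON_STEPS,
    "Scanner ON": GRADIENT_STEPS + PULSE_STEPS + COMMON_STEPS,
}

def steps_for(setting: str) -> list[str]:
    """Return the ordered preprocessing steps for a recording setting."""
    return list(PIPELINES[setting])

for name in PIPELINES:
    print(name, "->", len(steps_for(name)), "steps")
```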

Gradient artifact removal

The gradient artifact is the most significant noise source in simultaneous EEG-fMRI data, measuring more than 400 times larger than the lowest amplitude EEG events³. The FMRIB plugin uses the FASTR method to remove gradient artifacts³⁸. The method requires recording the scanner trigger at the start of each TR. An average template is computed from the detected TRs and subtracted from the raw EEG data. Following this process, the data is corrected further using principal component analysis (PCA) to reduce residual artifacts. Residual artifacts are further reduced using adaptive noise cancellation³.
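The core idea of average template subtraction can be illustrated on a single synthetic channel. This is a simplified sketch and not the FASTR implementation: it uses one global template, assumes the artifact is identical at every TR, and omits the PCA and adaptive noise cancellation stages. All names and data are hypothetical.

```python
def subtract_average_template(signal, trigger_idx, epoch_len):
    """Subtract the mean trigger-locked artifact template from each epoch."""
    valid = [t for t in trigger_idx if t + epoch_len <= len(signal)]
    epochs = [signal[t:t + epoch_len] for t in valid]
    # Average across epochs, sample by sample, to estimate the artifact.
    template = [sum(col) / len(epochs) for col in zip(*epochs)]
    cleaned = list(signal)
    for t in valid:
        for k in range(epoch_len):
            cleaned[t + k] -= template[k]
    return cleaned, template

# Synthetic data: a large artifact repeating every 4 samples ("TR"),
# added to a much smaller signal that is not locked to the trigger.
artifact = [0.0, 100.0, -100.0, 50.0]
neural = [0.1 * ((i % 3) - 1) for i in range(12)]
raw = [neural[i] + artifact[i % 4] for i in range(12)]

cleaned, template = subtract_average_template(raw, [0, 4, 8], 4)
```

Because the small signal averages out across epochs in this toy example, the template recovers the artifact and the cleaned trace matches the underlying signal.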

Pulse artifact/BCG removal

ECG data collected inside the MRI scanner has a pronounced T wave that increases as the field strength increases³⁹. The FMRIB plugin identifies these QRS events using an algorithm that detects events, aligns them, and corrects for false positives and negatives^40,41. A median signal is computed from the events to create an artifact template, which is subsequently subtracted from the data.
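Median template subtraction can be sketched in the same way. The example below takes QRS event positions as given (the plugin's detection algorithm is not reproduced), builds a median template from windows centred on each event, and subtracts it. Names, window size, and data are hypothetical.

```python
from statistics import median

def subtract_median_template(signal, qrs_idx, half_win):
    """Subtract a median pulse template from windows centred on each QRS event."""
    valid = [q for q in qrs_idx if q - half_win >= 0 and q + half_win < len(signal)]
    epochs = [signal[q - half_win:q + half_win + 1] for q in valid]
    # Median across events at each window position.
    template = [median(col) for col in zip(*epochs)]
    cleaned = list(signal)
    for q in valid:
        for k, value in enumerate(template):
            cleaned[q - half_win + k] -= value
    return cleaned, template

# Synthetic channel: the same pulse shape at three heartbeats.
pulse = [1.0, 5.0, 1.0]
raw = [0.0] * 15
for q in (2, 7, 12):
    for k, v in enumerate(pulse):
        raw[q - 1 + k] += v
raw[7] += 20.0  # one beat contaminated by an unrelated transient

cleaned, template = subtract_median_template(raw, [2, 7, 12], 1)
```

Using the median rather than the mean keeps the template from being distorted by the single contaminated beat, so only the pulse shape is removed and the transient remains in the cleaned trace.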

MRI preprocessing

We used the Connectome Computation System (CCS) to preprocess the MRI/fMRI data⁴². For the anatomical data, we performed skull stripping using a combination of the Brain Extraction Tool (BET) and FreeSurfer. Data were then segmented (FreeSurfer) and registered to a template space (MNI152 2006) using FLIRT and MCFLIRT^43,44. All fMRI runs were preprocessed identically. Initially, the first five volumes are discarded, then the data is despiked and slice time and motion corrected. The functional data is skull stripped with 3dAutomask, refined using the structural data, and registered to the anatomical images using boundary-based registration based on N4 using FreeSurfer. Nuisance correction is done using the Friston 24 motion parameters, average CSF, and WM signals, with/without global signal regression (GSR). The data is also processed with/without temporal filtering (0.01–0.1 Hz) and with/without a 6 mm FWHM spatial filter. Time series were extracted from 400 ROIs (Schaefer 400 atlas) to be further processed⁴⁵.

Data quality control

EEG

To assess the quality of the EEG data in this study, we followed a quality control pipeline similar to that described in Delorme et al.⁴⁶. This approach yields three metrics related to data quality: (i) percent of “good” channels; (ii) percent of “good” trials; and (iii) number of independent components (ICs) related to brain source activity. From this pipeline, “good” channels are defined as those that remain after completing the related preprocessing steps: removal of channels with more than five seconds of non-activity, with signal greater than four standard deviations due to high-frequency noise, or Pearson’s correlation coefficient less than 0.7 with nearby channels. Accordingly, “good” trials are related to the data periods that are not contaminated by artifacts such as body movement. In this study, we removed data segments with a variance higher than twenty times the variance of the calibration data. Finally, independent component analysis (ICA) was computed using the RunICA plugin for the EEGLAB toolbox to produce ICs⁴⁷. Afterward, the ICLabel plugin was used to identify ICs belonging to brain source activity⁴⁸. The resulting metric is the number of ICs associated with brain source activity expressed as a percentage of the total number of ICs found from ICA.
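Once channels, segments, and components have been classified, the three metrics reduce to simple percentages. The sketch below shows that final arithmetic only; the rejection criteria themselves are not implemented, the label strings are hypothetical, and the numbers are synthetic rather than results from the dataset.

```python
def percent(part: int, total: int) -> float:
    """Percentage of `part` in `total`, returning 0.0 for an empty total."""
    return 100.0 * part / total if total else 0.0

def eeg_quality_summary(n_channels, n_rejected_channels,
                        n_segments, n_rejected_segments, ic_labels):
    """Summarize good channels, good trials, and brain-labeled ICs as percentages."""
    n_brain = sum(1 for label in ic_labels if label == "brain")
    return {
        "good_channels_pct": percent(n_channels - n_rejected_channels, n_channels),
        "good_trials_pct": percent(n_segments - n_rejected_segments, n_segments),
        "brain_ics_pct": percent(n_brain, len(ic_labels)),
    }

# Synthetic example: 61 cortical channels with 3 rejected, 200 segments with
# 10 rejected, and four components of which two are labeled as brain sources.
summary = eeg_quality_summary(61, 3, 200, 10, ["brain", "eye", "brain", "muscle"])
```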

MRI

Temporal measures of fMRI data include mean and median framewise displacement⁴⁴, root mean square of the temporal change (DVARS), and temporal signal-to-noise ratio (tSNR).
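Two of these measures have compact definitions that can be shown directly. The following is a minimal sketch, assuming tSNR is the temporal mean divided by the temporal standard deviation of a voxel and DVARS is the root mean square, across voxels, of the change between consecutive volumes. Normalization conventions differ between tools, so the exact scaling here is an assumption, and the data are synthetic.

```python
from math import sqrt
from statistics import fmean, pstdev

def tsnr(voxel_ts):
    """Temporal SNR of one voxel: mean over time divided by SD over time."""
    sd = pstdev(voxel_ts)  # population SD; some tools use the sample SD
    return fmean(voxel_ts) / sd if sd > 0 else float("inf")

def dvars(volumes):
    """RMS across voxels of the signal change between consecutive volumes."""
    out = []
    for prev, curr in zip(volumes, volumes[1:]):
        squared_diffs = [(c - p) ** 2 for c, p in zip(curr, prev)]
        out.append(sqrt(fmean(squared_diffs)))
    return out

# Three synthetic volumes of two voxels each.
volumes = [[100.0, 200.0], [102.0, 198.0], [100.0, 200.0]]
dvars_values = dvars(volumes)
voxel_tsnr = tsnr([98.0, 100.0, 102.0])
```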

Statistical analysis

Permutation testing offers a robust framework for statistical significance assessment in EEG analysis. Multiple permutation testing was performed on the flickering checkerboard task to identify differences between the two task conditions: the rest and flickering checkerboard blocks. This analysis is used to find statistical relevance that identifies differences underlying the EEG data. One advantage of multiple permutation testing is that it does not require the same number of trials for each condition. In our permutation testing, trials were shuffled into two new groups, followed by the calculation of a paired t-test. Finally, pixel-based multiple comparisons correction was applied to reduce the familywise error rate⁴⁹.
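The shuffling step can be illustrated with a deliberately simplified example. The sketch below is a two-sample permutation test on the absolute difference of means for a single value per trial; it does not use the t-statistic or the pixel-based correction described above, and the function name, parameters, and data are assumptions for illustration only.

```python
import random
from statistics import fmean

def permutation_p_value(cond_a, cond_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    Group sizes may differ; labels are shuffled by repartitioning the pooled trials.
    """
    rng = random.Random(seed)
    observed = abs(fmean(cond_a) - fmean(cond_b))
    pooled = list(cond_a) + list(cond_b)
    n_a = len(cond_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Synthetic per-trial power values (arbitrary units), unequal trial counts.
checkerboard_power = [5.1, 4.8, 5.5, 5.0, 5.3, 4.9]
rest_power = [1.0, 1.2, 0.9, 1.1, 1.3]
p = permutation_p_value(checkerboard_power, rest_power)
```

In a time-frequency analysis, the same shuffling would be applied at every pixel, with the correction for multiple comparisons derived from the permutation distribution.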

Data privacy

All imaging data in this release has been de-identified, removing any personal identifying information (as defined by the Health Insurance Portability and Accountability Act, HIPAA) from data files, including facial features. Facial features from the T1w MRIs were removed using the “mri_deface” software package developed by Bischoff-Grethe et al.⁵⁰. Data and code are shared under the CC BY 4.0 license.

Data Records

Data access

All data can be accessed through the 1,000 Functional Connectomes Project and its International Neuroimaging Data-sharing Initiative (FCP/INDI)⁵¹. This website⁵¹ provides directions for users to directly download all the imaging data from an Amazon Simple Storage Service (S3) bucket. Raw and preprocessed data is provided through the website.

Data organization

All data have been organized following the Brain Imaging Data Structure (BIDS) format^52,53, which is an increasingly popular approach to describing imaging data in a standard format. Within the study S3 bucket, there are two folders, one containing raw data and another containing preprocessed data.

Inside the raw data folder, the BIDS standard is followed, and the base folder contains five sets of files (JSON and TSV) with information about the study. These include a description of the dataset, a list of participants including demographic information, and questionnaires regarding sleep patterns, caffeine intake, and thoughts and feelings during the experiment⁵⁴. Inside the subject and session folders, there are three folders: one containing the MRI anatomical images (anat), one containing the EEG and eye tracking data (eeg), and one containing the fMRI and respiratory data (func). MRI data is stored in NIfTI format, EEG data is in EEGLAB data file format (.set) with header and marker files in BrainVision format (.vhdr, .vmrk), and eye tracking and respiratory data are stored in TSV files. Sidecars (JSON files) with metadata are provided and contain information on acquisition parameters.

Preprocessed EEG and fMRI data are stored in the preprocessed folder. Data are organized in the same way as the raw data folder, and file formats are the same. Fully preprocessed fMRI data are located in the “func_preproc” folder within each subject/session/func/task folder. Preprocessed EEG and eye tracking data are in the “eeg” folder within the subject/session subfolders.

Technical Validation

EEG data validation and quality

Figure 2 shows the comparison of EEG during the checkerboard experiment across the three scan settings. Figure 2A shows the signal power over 2 s epochs during the checkerboard and rest task averaged across subjects at the Oz electrode. Comparing the checkerboard versus rest condition, the checkerboard shows peaks at 12 Hz and 24 Hz, representing the driving frequency of the checkerboard and its harmonic, respectively. When moving into the scanner environment, the power of the checkerboard response is reduced. Nonetheless, peaks at 12 Hz and 24 Hz are still visible in the Scanner OFF and Scanner ON settings. The Scanner ON setting also contains a dip at 18 Hz in both rest and checkerboard conditions. This dip denotes residual artifacts that are left behind in the signal after the gradient artifact removal step. When looking at the frequency content over the duration of the 2 s epoch, the difference between the rest and checkerboard conditions is evident, with the 12 Hz driving frequency and the 24 Hz harmonic appearing across the entire epoch (Fig. 2B). As shown in Fig. 2C, these differences are statistically significant for the three settings. While these plots focus on the Oz electrode, the signal power also extends to other electrodes in the occipital region (Fig. 2D).

Three quality assessment metrics were computed for each raw EEG dataset: percent of “good” channels, percent of “good” trials, and the number of independent components (ICs) related to brain source activity as a percentage of the total number of ICs. As shown in Fig. 3, data quality was high across all subjects for the percentage of good channels and trials for the checkerboard task. Although the data quality was highest in the Outside setting, high percentages were found for channels and trials in the Scanner OFF and Scanner ON settings. Similarly, as seen in Fig. 4, the percentage of good channels and trials was high across tasks, denoting the stability of data quality during the scan session. The percent of putative brain sources based on ICs classification was lower for the Scanner ON setting compared to the other two settings. Due to the increase in noise sources in the Scanner OFF (e.g., pulse artifact) and Scanner ON (e.g., gradient artifact) settings, the percentage of ICs related to brain sources is expected to decrease. As shown in Fig. 4, the quality of the EEG data is stable across scan settings.

fMRI data quality

To assess the quality of fMRI data, median framewise displacement (FD) was measured for all scans. As shown in Fig. 5, the median FD was computed for every fMRI scan; scans with a value above 0.2 were considered high motion. To determine if there was an ordering effect, scan sessions were color coded to determine if participants moved earlier or later in the scan. Most subject data were below the 0.2 threshold (93% of scans), and there was no pattern of ordering across participants.
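The thresholding step can be expressed in a few lines. This is a minimal sketch, assuming a per-volume FD trace is already available for each scan; the scan labels and FD values are synthetic and the function name is illustrative.

```python
from statistics import median

HIGH_MOTION_THRESHOLD = 0.2  # median FD cutoff stated in the text

def summarize_motion(fd_by_scan, threshold=HIGH_MOTION_THRESHOLD):
    """Return per-scan median FD, high-motion flags, and percent of scans not flagged."""
    medians = {scan: median(fd) for scan, fd in fd_by_scan.items()}
    high = {scan: m > threshold for scan, m in medians.items()}
    pct_low = 100.0 * sum(not flag for flag in high.values()) / len(high)
    return medians, high, pct_low

# Hypothetical scan labels with synthetic per-volume FD values.
fd_by_scan = {
    "rest": [0.05, 0.08, 0.06, 0.30],
    "inscapes": [0.25, 0.31, 0.22, 0.10],
}
medians, high_motion, pct_low = summarize_motion(fd_by_scan)
```

The median is insensitive to isolated motion spikes, so a scan such as the synthetic "rest" example is not flagged despite one large displacement.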

For the checkerboard experiment, we looked at the correlation between ROIs within and between subjects and across scans (Fig. 6). The within-scan and within-subject comparisons showed broader distributions of values, with higher correlations than the other comparisons.

Multimodal quality data correlations

Values for EEG and MRI data were compared within and between modalities across several quality metrics: mean FD (framewise displacement), median FD, DVARS (temporal derivative of time courses), and tSNR (temporal signal-to-noise ratio) for MRI; channels, trials, and brain sources for EEG (Fig. 7). Spearman’s ρ computed between each pair of measures showed a strong positive correlation between mean and median FD, and a strong negative correlation between FD measures and tSNR. A weak correlation was found between DVARS and tSNR, but no association was found between DVARS and other measures. For EEG measures, there was no correlation between the different quality measures. Moreover, there was no correlation between quality measures across imaging modalities.
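Spearman's ρ is the Pearson correlation of the ranked values, which makes it suitable for quality metrics on different scales. The sketch below implements it with average ranks for ties; the values are synthetic and chosen only to show a perfectly monotonic negative relationship, not to reproduce Fig. 7.

```python
from math import sqrt
from statistics import fmean

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Synthetic per-scan values: higher motion paired with lower tSNR.
mean_fd = [0.05, 0.10, 0.20, 0.40]
tsnr_values = [90.0, 70.0, 60.0, 30.0]
rho = spearman_rho(mean_fd, tsnr_values)
```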

Multimodal data integration

As a test for multimodal data integration, we evaluated whether we could use the EEG signal to predict the hemodynamic response in the fMRI data. Specifically, after preprocessing, the EEG signal from the Oz electrode was averaged across participants for the checkerboard experiment. This signal was then bandpass filtered (20th order IIR filter between 11 Hz and 13 Hz), modulated, and convolved with an ideal hemodynamic response function (using a gamma variate function). This signal was used as a regressor for each participant to map out the BOLD activity. A one-sample t-test was performed to calculate a group activity map (Fig. 8A). For comparison purposes, a regressor based on an ideal block design convolved with a gamma variate function was also calculated to look at the group-level activity (Fig. 8B). Activity maps shown in Fig. 8 indicate that both approaches generate a similar level of activity in the occipital lobe.
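The block-design comparison regressor can be sketched as a boxcar convolved with a gamma variate function sampled at the TR. The gamma variate parameters, the number of volumes, and the kernel length below are assumptions (the text does not state them), and the EEG-derived regressor, which requires filtering and modulating the Oz signal, is not reproduced here.

```python
import math

TR_S = 2.1  # repetition time stated in the acquisition parameters

def gamma_variate_hrf(t, b=8.6, c=0.547):
    """Gamma variate t**b * exp(-t/c), scaled so its peak (at t = b*c) equals 1.

    The values of b and c are assumed for illustration.
    """
    if t <= 0:
        return 0.0
    peak = (b * c) ** b * math.exp(-b)
    return t ** b * math.exp(-t / c) / peak

def convolve(signal, kernel):
    """Causal discrete convolution truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out

n_vols = 96  # hypothetical run length in volumes
times = [i * TR_S for i in range(n_vols)]
# Boxcar: 0 during each 20 s rest block, 1 during each 20 s checkerboard block.
boxcar = [1.0 if (t % 40.0) >= 20.0 else 0.0 for t in times]
hrf = [gamma_variate_hrf(i * TR_S) for i in range(int(20 / TR_S))]
regressor = convolve(boxcar, hrf)
```

The resulting regressor rises a few seconds after each block onset, reflecting the delayed hemodynamic response.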

Technical challenges

Collecting EEG and fMRI simultaneously requires several methodological considerations. While EEG and fMRI have a long-established history, collecting EEG inside the MRI scanner is challenging for several technical reasons. The main problem encountered when collecting a functional recording is the generation of artifacts from various sources. The main artifact arises from the gradient artifact generated during echo-planar imaging (EPI), which induces changes in the magnetic field⁵⁵. Another source of noise arises from the scanner environment. While not a problem in all scanners, vibrations from the helium compressor in Siemens Trio and Verio scanners introduce artifacts into the EEG signal⁵⁶; these vibrations induce non-stationary artifacts that contaminate the EEG signal. Yet another source of noise is caused by the pulsation of arteries in the scalp, which causes movement in EEG electrodes and generates voltage. The ballistocardiogram (BCG) signal captures the ballistic forces of blood in the cardiac cycle^4,5 and becomes more pronounced as the magnetic field strength increases⁶. In addition to a more pronounced signal, the ECG signal can impact data collection and preprocessing. In some cases, the pronounced ECG signal leads to saturation of the signal during the MRI scan sequence. Consequently, this saturation causes signal clipping that impedes QRS detection and pulse artifact removal methods during preprocessing. In this data release, there are occasions of signal clipping of the ECG channel. For participants where QRS detection of the ECG channel failed, one method used in this study was to perform QRS detection on every EEG channel and select the channel containing the mode of the detected QRS complexes. From this channel, the median template is created and applied across channels for pulse artifact removal.

Several techniques exist to address these sources of noise. Gradient artifacts can be minimized by modifying the configuration or layout of EEG leads⁵⁷ or the placement of the head in the coil⁵⁸. To remove the gradient field artifact, we use the MR clock to record the scanner trigger at every TR⁵⁹,⁶⁰. Using a template artifact subtraction method⁶¹, the gradient artifact is recorded at each TR onset and averaged to create a template. The template is then subtracted from the signal to produce a clean signal. In this study, we used the FMRIB plug-in for EEGLAB, provided by the University of Oxford Centre for Functional MRI of the Brain (FMRIB), to regress the MRI gradient artifact³⁷,³⁸. For noise induced by the helium compressor, there are methods for recording and regressing this motion-induced artifact⁷,⁶²; however, in our experiments, the simplest method for removing this artifact was to turn off the helium compressor during simultaneous recordings. While there is a risk of helium boiling off as the temperature rises in the scanner, this can be addressed by having shorter scan sessions. In our study, the temperature of the cooling system did not show fluctuations that would affect cryogen loss. While shorter scans are ideal, we collected data for upwards of 2 hours without issue.
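
The template subtraction idea can be shown with a toy Python sketch. It is a simplified illustration, not the FMRIB plug-in's implementation, which adds further refinements; the function name `subtract_gradient_template` and all values are hypothetical. The sketch assumes the artifact repeats identically after every TR trigger and that the underlying signal averages out across TRs.

```python
def subtract_gradient_template(signal, tr_onsets, epoch_len):
    """Average the epochs that follow each TR trigger, then subtract the average."""
    starts = [s for s in tr_onsets if s + epoch_len <= len(signal)]
    epochs = [signal[s:s + epoch_len] for s in starts]
    template = [sum(column) / len(epochs) for column in zip(*epochs)]
    cleaned = list(signal)
    for s in starts:
        for offset, value in enumerate(template):
            cleaned[s + offset] -= value
    return cleaned, template


# Toy recording: the same artifact in every TR plus a signal that differs between TRs.
artifact = [5.0, -3.0, 2.0, 0.0]
brain = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
recorded = [a + b for a, b in zip(artifact * 2, brain)]
cleaned, template = subtract_gradient_template(recorded, tr_onsets=[0, 4], epoch_len=4)
```

In this toy case the template recovers the artifact exactly because the non-artifact signal cancels in the average. With real data the cancellation is only approximate, which is why accurate trigger timing from the MR clock matters.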

Another factor found to impact EEG data quality was signal clipping, which often appears in the ECG channel during simultaneous recordings. In our simultaneous EEG-fMRI recordings, cable and amplifier placement inside the scanner also affected EEG data quality. For cable placement, several factors must be taken into consideration. When scanning, researchers should minimize loops in their setup, run cables along the center of the bore, and place the connected amplifier at the center of the bore to ensure better data quality⁵⁵. Excessive bends or loops in wires can induce currents in the cables, thus introducing artifacts into the EEG signal. Another way to reduce artifacts is to reduce cable length between the EEG cap and connected amplifiers. All major scanner vendors offer head coils that are designed with a channel for EEG cables that lie directly above a participant’s head⁶³. In addition, cables that are bundled produce fewer artifacts than ribboned cables⁶⁴. In our experiments, the head coil did not contain a channel for EEG cables and a ribboned cable was used to connect the cap and amplifier. To reduce artifacts, EEG cables were run through the head coil above the participant’s head and taped along the center of the bore to minimize movement and to ensure an optimal position in the scanner.

Code availability

Code for presenting task stimuli and naturalistic stimuli, along with code to preprocess EEG and fMRI imaging data, is available on GitHub (https://github.com/NathanKlineInstitute/NATVIEW_EEGFMRI). The videos used for naturalistic stimuli will also be made available through the GitHub repository.

References

1. Mele, G. et al. Simultaneous EEG-fMRI for functional neurological assessment. Front. Neurol. 10, 848 (2019).
2. Nentwich, M. et al. Functional connectivity of EEG is subject-specific, associated with phenotype, and different from fMRI. Neuroimage 218, 117001 (2020).
3. Allen, P. J., Josephs, O. & Turner, R. A method for removing imaging artifact from continuous EEG recorded during functional MRI. Neuroimage 12, 230–239 (2000).
4. Yan, W. X., Mullinger, K. J., Geirsdottir, G. B. & Bowtell, R. Physical modeling of pulse artefact sources in simultaneous EEG/fMRI. Hum. Brain Mapp. 31, 604–620 (2010).
5. Luo, Q., Huang, X. & Glover, G. H. Ballistocardiogram artifact removal with a reference layer and standard EEG cap. J. Neurosci. Methods 233, 137–149 (2014).
6. Neuner, I., Arrubla, J., Felder, J. & Shah, N. J. Simultaneous EEG-fMRI acquisition at low, high and ultra-high magnetic fields up to 9.4 T: perspectives and challenges. Neuroimage 102(Pt 1), 71–79 (2014).
7. van der Meer, J. N. et al. Carbon-wire loop based artifact correction outperforms post-processing EEG/fMRI corrections–A validation of a real-time simultaneous EEG/fMRI correction method. Neuroimage 125, 880–894 (2016).
8. Hoffmann, S. & Falkenstein, M. The correction of eye blink artefacts in the EEG: a comparison of two prominent methods. PLoS One 3, e3004 (2008).
9. Power, J. D. et al. Distinctions among real and apparent respiratory motions in human fMRI data. Neuroimage 201, 116041 (2019).
10. Fishell, A. K., Burns-Yocum, T. M., Bergonzi, K. M., Eggebrecht, A. T. & Culver, J. P. Mapping brain function during naturalistic viewing using high-density diffuse optical tomography. Sci. Rep. 9, 11115 (2019).
11. Naturalistic stimuli: A paradigm for multiscale functional characterization of the human brain. Current Opinion in Biomedical Engineering
12. Sonkusare, S., Breakspear, M. & Guo, C. Naturalistic Stimuli in Neuroscience: Critically Acclaimed. Trends Cogn. Sci. 23, 699–714 (2019).
13. Nastase, S. A., Goldstein, A. & Hasson, U. Keep it real: rethinking the primacy of experimental control in cognitive neuroscience. Neuroimage 222, 117254 (2020).
14. Saarimäki, H. Naturalistic stimuli in affective neuroimaging: A review. Front. Hum. Neurosci. 15, 675068 (2021).
15. Goldberg, H., Preminger, S. & Malach, R. The emotion-action link? Naturalistic emotional stimuli preferentially activate the human dorsal visual stream. Neuroimage 84, 254–264 (2014).
16. Hasson, U., Furman, O., Clark, D., Dudai, Y. & Davachi, L. Enhanced intersubject correlations during movie viewing correlate with successful episodic encoding. Neuron 57, 452–462 (2008).
17. Yang, Z. et al. Measurement reliability for individual differences in multilayer network dynamics: Cautions and considerations. Neuroimage 225, 117489 (2021).
18. Afdile, M. Beyond neurocinematics: Investigating biased social perception through collaboration between neuroscience and filmmaking. Leonardo 55, 278–282 (2022).
19. Vanderwal, T., Kelly, C., Eilbott, J., Mayes, L. C. & Castellanos, F. X. Inscapes: A movie paradigm to improve compliance in functional magnetic resonance imaging. Neuroimage 122, 222–232 (2015).
20. Park, S. H. et al. Functional subpopulations of neurons in a macaque face patch revealed by single-unit fMRI mapping. Neuron 95, 971–981.e5 (2017).
21. Park, S. H. et al. Parallel functional subnetworks embedded in the macaque face patch system. Sci. Adv. 8, eabm2054 (2022).
22. Mantini, D. et al. Interspecies activity correlations reveal functional correspondence between monkey and human brain areas. Nat. Methods 9, 277–282 (2012).
23. Buysse, D. J., Reynolds, C. F. 3rd, Monk, T. H., Berman, S. R. & Kupfer, D. J. The Pittsburgh Sleep Quality Index: a new instrument for psychiatric practice and research. Psychiatry Res. 28, 193–213 (1989).
24. SR Research Ltd. EyeLink 1000 User Manual. http://sr-research.jp/support/EyeLink%201000%20User%20Manual%201.5.0.pdf (2009).
25. Papakostopoulos, D. & Blackmore, S. Reviews: Human brain electrophysiology: Evoked potentials and evoked magnetic fields in science and medicine, from sentience to symbols: Readings on consciousness. Perception 22, 375–377 (1993).
26. Herrmann, C. S. Human EEG responses to 1–100 Hz flicker: resonance phenomena in visual cortex and their potential correlation to cognitive phenomena. Exp. Brain Res. 137, 346–353 (2001).
27. Sun, P. et al. BOLD signal change and contrast reversing frequency: an event-related fMRI study in human primary visual cortex. PLoS One 9, e99547 (2014).
28. Bayram, A., Karahan, E., Bilgiç, B., Ademoglu, A. & Demiralp, T. Achromatic temporal-frequency responses of human lateral geniculate nucleus and primary visual cortex. Vision Res. 127, 177–185 (2016).
29. Son, J. et al. Evaluating fMRI-Based Estimation of Eye Gaze During Naturalistic Viewing. Cereb. Cortex 30, 30 (2019).
30. Frey, J. The Present. https://www.youtube.com/watch?v=C_nJJHaNmnY (2016).
31. Alexander, L. M. et al. An open resource for transdiagnostic research in pediatric mental health and learning disorders. Sci Data 4, 170181 (2017).
32. Russ, B. E. & Leopold, D. A. Functional MRI mapping of dynamic visual features during natural viewing in the macaque. Neuroimage 109, 84–94 (2015).
33. Russ, B. & Leopold, D. Russ Leopold nonhuman primate movies. https://doi.org/10.5281/zenodo.4623809 (2020).
34. Egan, M. K., Larsen, R., Wirsich, J., Sutton, B. P. & Sadaghiani, S. Safety and data quality of EEG recorded simultaneously with multi-band fMRI. PLoS One 16, e0238485 (2021).
35. Hanke, M. et al. A studyforrest extension, simultaneous fMRI and eye gaze recordings during prolonged natural stimulation. Sci Data 3, 160092 (2016).
36. Delorme, A. & Makeig, S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods 134, 9–21 (2004).
37. Iannetti, G. D. et al. Simultaneous recording of laser-evoked brain potentials and continuous, high-field functional magnetic resonance imaging in humans. Neuroimage 28, 708–719 (2005).
38. Niazy, R. K., Beckmann, C. F., Iannetti, G. D., Brady, J. M. & Smith, S. M. Removal of FMRI environment artifacts from EEG data using optimal basis sets. Neuroimage 28, 720–737 (2005).
39. Schmidt, M., Krug, J. W., Rosenheimer, M. N. & Rose, G. Filtering of ECG signals distorted by magnetic field gradients during MRI using non-linear filters and higher-order statistics. Biomed. Tech. (Berl.) 63, 395–406 (2018).
40. Christov, I. I. Real time electrocardiogram QRS detection using combined adaptive threshold. Biomed. Eng. Online 3, 28 (2004).
41. Kim, K. H., Yoon, H. W. & Park, H. W. Improved ballistocardiac artifact removal from the electroencephalogram recorded in fMRI. J. Neurosci. Methods 135, 193–203 (2004).
42. Xu, T., Yang, Z., Jiang, L., Xing, X.-X. & Zuo, X.-N. A Connectome Computation System for discovery science of brain. Sci. Bull. (Beijing) 60, 86–95 (2015).
43. Jenkinson, M. & Smith, S. A global optimisation method for robust affine registration of brain images. Med. Image Anal. 5, 143–156 (2001).
44. Jenkinson, M., Bannister, P., Brady, M. & Smith, S. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage 17, 825–841 (2002).
45. Schaefer, A. et al. Local-Global Parcellation of the Human Cerebral Cortex from Intrinsic Functional Connectivity MRI. Cereb. Cortex 28, 3095–3114 (2018).
46. Delorme, A. et al. NEMAR: An open access data, tools, and compute resource operating on NeuroElectroMagnetic data. Database 2022, (2022).
47. Makeig, S., Bell, A., Jung, T.-P. & Sejnowski, T. J. in Advances in Neural Information Processing Systems 8, (MIT Press, 1995).
48. Pion-Tonachini, L., Kreutz-Delgado, K. & Makeig, S. ICLabel: An automated electroencephalographic independent component classifier, dataset, and website. Neuroimage 198, 181–197 (2019).
49. Cohen, M. X. Analyzing neural time series data: Theory and practice. (MIT Press, 2014).
50. Bischoff-Grethe, A. et al. A technique for the deidentification of structural brain MR images. Hum. Brain Mapp. 28, 892–903 (2007).
51. Telesford, Q. K. et al. EEG/FMRI Naturalistic Viewing Dataset. International Neuroimaging Data Sharing Initiative. https://doi.org/10.15387/fcp_indi.retro.Nat_View (2023).
52. Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 1–9 (2016).
53. Pernet, C. R. et al. EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Sci Data 6, 103 (2019).
54. Gorgolewski, K. J. et al. A correspondence between individual differences in the brain’s intrinsic functional architecture and the content and form of self-generated thoughts. PLoS One 9, e97176 (2014).
55. Ritter, P. & Villringer, A. Simultaneous EEG-fMRI. Neurosci. Biobehav. Rev. 30, 823–838 (2006).
56. Nierhaus, T. et al. Internal ventilation system of MR scanners induces specific EEG artifact during simultaneous EEG-fMRI. Neuroimage 74, 70–76 (2013).
57. Ferreira, J. L., Wu, Y., Besseling, R. M. H., Lamerichs, R. & Aarts, R. M. Gradient artefact correction and evaluation of the EEG recorded simultaneously with fMRI data using optimised moving-average. J. Med. Eng. 2016, 9614323 (2016).
58. Mullinger, K. J., Yan, W. X. & Bowtell, R. Reducing the gradient artefact in simultaneous EEG-fMRI by adjusting the subject’s axial position. Neuroimage 54, 1942–1950 (2011).
59. Negishi, M., Abildgaard, M., Nixon, T. & Constable, R. T. Removal of time-varying gradient artifacts from EEG data acquired during continuous fMRI. Clin. Neurophysiol. 115, 2181–2192 (2004).
60. Ritter, P., Becker, R., Graefe, C. & Villringer, A. Evaluating gradient artifact correction of EEG data acquired simultaneously with fMRI. Magn. Reson. Imaging 25, 923–932 (2007).
61. Hashimoto, T., Elder, C. M. & Vitek, J. L. A template subtraction method for stimulus artifact removal in high-frequency deep brain stimulation. J. Neurosci. Methods 113, 181–186 (2002).
62. Abbott, D. F. et al. Constructing carbon fiber motion-detection loops for simultaneous EEG–fMRI. Front. Neurol. 5, 260 (2014).
63. Brain Products. Changes to the Brain Products standard BrainCap MR. https://pressrelease.brainproducts.com/braincap_mr/ (2020).
64. Chowdhury, M. E. H., Mullinger, K. J. & Bowtell, R. Simultaneous EEG-fMRI: evaluating the effect of the cabling configuration on the gradient artefact. Phys. Med. Biol. 60, N241–50 (2015).

Acknowledgements

We would like to acknowledge Raj Sangoi and Caixia Hu for providing their technical support and expertise in developing the scanning protocol for data collection. We would also like to acknowledge Mark Higger for his contributions to developing code for the EEG preprocessing pipeline. Primary support for the work is provided by the BRAIN Initiative (R01MH111439), CONTE center (P50MH109429), and Rockland Sample (R01MH124045) grants from the NIH. Data hosting is supported by AWS’s Open Data program.

Author information

Authors and Affiliations

Center for Brain Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, USA
Qawi K. Telesford, Eduardo Gonzalez-Moreira, Yiwen Tian, Stanley J. Colcombe, Jessica Cloud, Brian E. Russ, Arnaud Falchier, Charles E. Schroeder, Michael P. Milham & Alexandre R. Franco
Center for the Developing Brain, Child Mind Institute, New York, NY, USA
Ting Xu, Michael P. Milham & Alexandre R. Franco
Department of Psychiatry, New York University Grossman School of Medicine, New York, NY, USA
Stanley J. Colcombe & Alexandre R. Franco
Department of Biomedical Engineering, The City College of the City University of New York, New York, NY, USA
Maximilian Nentwich, Jens Madsen & Lucas C. Parra
Departments of Psychiatry and Neurology, Columbia University College of Physicians and Surgeons, New York, NY, USA
Charles E. Schroeder

Contributions

Qawi Telesford collected data, analyzed and interpreted data, developed the protocol for data collection, developed and wrote the code used in experiments, created the EEG preprocessing pipeline, and is the primary author of the manuscript. Eduardo Gonzalez-Moreira analyzed and interpreted data, helped develop the EEG preprocessing pipeline, developed the quality control metrics for EEG data, and helped write and edit the manuscript. Ting Xu analyzed and interpreted data, developed and wrote code used in the fMRI preprocessing pipeline, and helped write and edit the manuscript. Yiwen Tian developed and wrote code for transferring fMRI data from the scanner and its conversion to the Brain Imaging Data Structure (BIDS) standard. Stanley Colcombe developed the scanning protocol for simultaneous recordings and helped write and edit the manuscript. Jessica Cloud collected data, helped develop the protocol for data collection, and developed and wrote code used in experiments. Brian Edward Russ helped develop the protocol for data collection and helped write and edit the manuscript. Arnaud Falchier helped develop the protocol for data collection and helped write and edit the manuscript. Maximilian Nentwich analyzed and interpreted data, wrote and developed code for processing eye tracking data, provided support for technical validation, and helped write and edit the manuscript. Jens Madsen wrote and developed code for processing eye tracking data, provided support for technical validation, and helped write and edit the manuscript. Lucas Parra helped write and edit the manuscript. Charles Schroeder obtained funding for this project and helped write and edit the manuscript. Michael Milham obtained funding for this project and helped write and edit the manuscript. Alexandre Rosa Franco analyzed and interpreted data, led the data sharing, and helped write and edit the manuscript.

Corresponding author

Correspondence to Alexandre R. Franco.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Telesford, Q.K., Gonzalez-Moreira, E., Xu, T. et al. An open-access dataset of naturalistic viewing using simultaneous EEG-fMRI. Sci Data 10, 554 (2023). https://doi.org/10.1038/s41597-023-02458-8

Received: 20 March 2023
Accepted: 09 August 2023
Published: 23 August 2023
DOI: https://doi.org/10.1038/s41597-023-02458-8