Intracranial human recordings are a valuable and rare resource of information about the brain. Making such data publicly available not only helps tackle reproducibility issues in science, it also maximizes the use of these valuable data. This is especially true for data collected using naturalistic tasks. Here, we describe a dataset collected from a large group of human subjects while they watched a short audiovisual film. The dataset has several unique features. First, it includes a large amount of intracranial electroencephalography (iEEG) data (51 participants, age range of 5–55 years, who all performed the same task). Second, it includes functional magnetic resonance imaging (fMRI) recordings (30 participants, age range of 7–47 years) during the same task. Eighteen participants performed both the iEEG and fMRI versions of the task, non-simultaneously. Third, the data were acquired using a rich audiovisual stimulus, for which we provide detailed speech and video annotations. This dataset can be used to study neural mechanisms of multimodal perception and language comprehension, as well as the similarity of neural signals across brain recording modalities.
Background & Summary
We live in a world of data, and big, high-quality datasets that lend themselves to modern sophisticated analyses are becoming increasingly sought-after. Following examples in other branches of science, cognitive neuroscience is showing a growing trend toward open science and data sharing1,2,3. Many datasets from volunteers participating in cognitive neuroscience experiments are now becoming publicly available4,5,6,7,8,9. This is coupled with a recent trend toward more naturalistic experimental designs, as they provide rich, versatile datasets and thus lend themselves well to many different analyses targeting various aspects of complex cognition10,11,12. Open research practices promote data reuse, research reproducibility, scientific collaboration and novel ways to analyze the data that have not been possible before3,13, and therefore have the potential to advance the entire field of cognitive neuroscience faster and more efficiently than ever before14.
Cognitive neuroscience experiments are concerned with the neural mechanisms of cognitive processes including speech, sensory perception, memory, social interactions and others. These are most often studied with popular non-invasive techniques, such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG) and magnetoencephalography (MEG), and currently available open datasets indeed contain data collected with these techniques4,5,6,7,8,9,15,16. Despite their great value for the field, these non-invasive techniques have a number of important limitations, such as the low temporal resolution of fMRI17,18, the low spatial resolution of EEG and MEG, and susceptibility to artefacts that render part of the recorded signal unusable (EEG and MEG)19,20,21. To study highly dynamic cognitive processes in humans (speech, in particular), techniques that provide high spatial and temporal resolution and a clean neural signal are preferred.
One such technique is human intracranial electroencephalography (iEEG). IEEG data are collected from patients who participate in a relatively rare procedure for localization of the source of their epileptic seizures. For this, patients are implanted subdurally with electrode grids (electrocorticography, ECoG) and/or depth electrodes (stereo-electroencephalography, sEEG) typically for a week of clinical monitoring, during which the patients can also participate in research experiments. Direct contact with the brain tissue grants iEEG several advantages compared to non-invasive brain recording modalities, including a combination of high temporal and spatial resolution, and exceptional signal-to-noise ratio. Human iEEG research has contributed to our fundamental understanding of high-level cognition that cannot be studied in animals, such as speech22,23,24,25, semantic and conceptual representation26,27,28 and abstract thought29. In addition, iEEG research on speech has shown significant promise for the development of advanced brain-computer interfaces aimed to restore communication in paralyzed patients30,31,32,33.
The unique characteristics of human iEEG recordings make them a valuable resource of information about the brain that should be used to its full potential through data sharing and open collaborations. However, due to multiple factors, the data are rarely shared. First, iEEG can only be obtained in a clinical setting, and as such, it is difficult and slow to collect for research purposes. The few medical centers in the world that acquire iEEG data have low patient rates (5–10 patients a year) that, together with variability in electrode coverage, lead to long study timeframes and small sample sizes. Moreover, iEEG is sensitive medical data, and many centers lack ethical protocols that allow public sharing. As a result, little iEEG data has been publicly available so far (with a few notable exceptions34,35,36).
Our lab has been collecting iEEG data for over ten years. A few years ago we developed ethical protocols that addressed the issue of data sharing and allowed us to request patients’ consent to publish their de-identified data and make it openly accessible to the entire research community. Data sharing was further facilitated by the progress our colleagues made on a new standard iEEG data format, iBIDS37, that greatly simplifies and unifies data curation and preparation for public sharing.
As a result of this work, we here present the first large multimodal iEEG-fMRI dataset from a naturalistic cognitive task38. The present dataset is unique in a number of ways. First, it contains a large amount of iEEG data (51 subjects who all performed the same task). Second, the dataset provides fMRI recordings (30 subjects) from the same task. Eighteen subjects performed the task with both recording modalities: first with fMRI and, several days or weeks later, with iEEG. Third, the data come from naturalistic stimulation with a short audiovisual film, for which we provide rich audio and video annotations. The inclusion of data from two neural recording modalities opens up new possibilities for research on neurovascular coupling in the context of a naturalistic experiment.
The dataset we present can be used to target many theoretical, methodological and applied questions in cognitive neuroscience including language, auditory, visual and multimodal perception; study of the internal dynamics of iEEG signals; and investigation of iEEG-fMRI coupling during the same task. We believe that this work has the potential to promote open science and data-sharing in the iEEG field, and support open research practices in the cognitive neuroscience community as a whole.
All participants were admitted to the University Medical Center Utrecht for diagnostic procedures related to their medication-resistant epilepsy. They underwent intracranial electrode implantation to determine the source of seizures and test the possibility of surgical removal of the corresponding brain tissue. The tasks (movie watching and resting state) were performed by the patients either as part of clinical function mapping procedures, in which our team was involved (acquired as clinical data), or as part of their participation in scientific research done by our group (acquired as research data). In the former case, patients gave written permission to use their clinical data for research purposes (sixteen patients). In the latter case, patients gave their written informed consent to participate in research tasks (forty-seven patients). All patients gave their consent to share their de-identified data publicly. For participants under 18 (twenty-eight patients), informed consent was obtained from the participant’s parents and/or legal guardian. If older than 12, these participants also signed the informed consent form. The study was approved by the Medical Ethical Committee of the University Medical Center Utrecht in accordance with the Declaration of Helsinki (2013).
Data from fifty-one iEEG patients (mean age 25 years, SD 15; 32 females) are included in the present dataset. Basic demographic information about all participants in the dataset is shown in Table 1. Forty-six patients were implanted with subdural ECoG grids (clinical grids with 2.3 mm exposed diameter, inter-electrode distance of 10 mm, between 48 and 128 contact points). Six patients were additionally implanted with a high-density (HD) ECoG grid (1.3 mm exposed diameter, inter-electrode distance 3–4 mm, with 32, 64 or 128 contact points). Sixteen patients were implanted with sEEG electrodes (between 4 and 173 contact points). Most patients had perisylvian grid coverage and most had electrodes in frontal and motor cortices (Fig. 1c).
Forty-five patients were implanted with electrodes in the left hemisphere, which was also language-dominant in most cases, based on fMRI, intracarotid amobarbital test (Wada), electrical stimulation or functional transcranial Doppler sonography test (Table 1). Nine patients had electrodes in the right hemisphere, and three patients had electrodes in both hemispheres.
As part of the presurgical workup, eighteen of the above fifty-one iEEG patients underwent fMRI recordings and participated in the fMRI version of the movie-watching task. In addition, twelve more patients participated only in the fMRI experiment. In total, thirty fMRI participants were included (mean age 22 years, SD 11; 14 females). A diagram of the overlap between the fMRI and iEEG participants is shown in Fig. 1b.
Movie-watching experiment (iEEG and fMRI)
The short movie-watching task was developed as part of the standard battery of clinical tasks for presurgical functional language mapping. Therefore, most patients performed this task based on clinical request. The remaining patients were offered the task as part of research they had agreed to participate in. Most participants watched the short movie during either the iEEG or the fMRI experiment. Some participants watched the movie during both iEEG and fMRI recordings. The fMRI experiment always preceded the iEEG recordings, as fMRI data collection was part of the presurgical workup that typically took place several weeks prior to electrode implantation. In the movie-watching experiment, each patient was asked to attend to the short movie made of fragments from one of the Pippi Longstocking movies (see details below). No fixation cross was displayed in the middle of the screen or elsewhere. Instead, the participants were free to watch the film in as naturalistic a setting as possible. In the fMRI experiment, the video was delivered on a screen through a scanner mirror and the audio was delivered through earphones. In the iEEG experiment, the video was delivered on a computer screen (21 inches in diagonal) placed directly in the patient’s room (at approximately one meter from the patient’s face), and the stereo sound was delivered through speakers with the volume level adjusted for each patient.
In both cases the movie was presented using the Presentation software (Neurobehavioral Systems, Berkeley, CA) and the sound was synchronized with the neural recordings.
Resting state experiment (iEEG)
During iEEG recordings twenty-six iEEG patients participated in a three-minute resting state experiment. Some patients performed the resting state task and the movie-watching task on the same day, others did the tasks on different days. The movie-watching task was performed first if it was part of the clinical testing of functional cortical mapping. The resting state experiment was collected for research purposes only.
No resting state recordings during fMRI experiments were available.
Natural resting state data (iEEG)
Even though there was generally sufficient time to collect resting state data with iEEG patients, it was not always feasible for many practical reasons. As a result, twenty-four patients did not participate in a separate resting state task. In order to provide some form of baseline neural activity for these iEEG patients, we selected a 3-minute fragment of ‘natural rest’ from each of these patients’ continuous 24/7 clinical iEEG recordings. We used clinical audiovisual recordings of the room to ensure that during ‘natural rest’ patients did not speak and were not spoken to, it was quiet in the room, and the patient was resting with their eyes open.
A 6.5-minute short movie, made of fragments from “Pippi on the Run” (På rymmen med Pippi Långstrump, 1970), was edited together to form a coherent plot while keeping the task duration limited. As a clinical task designed for language mapping, the movie consisted of 13 interleaved blocks of speech and music, 30 seconds each (seven blocks of music, six blocks of speech). The movie was originally in Swedish but dubbed into Dutch. Due to copyright permissions, we cannot share the movie stimulus itself in a public repository; however, we are allowed to distribute it upon request to the corresponding author for strictly non-profit research purposes related to the present neural dataset. As part of the dataset, we do provide detailed annotations of the audio and video content of the movie stimulus.
Annotation of the movie soundtrack was done manually using Praat39 (http://www.praat.org). Onsets and offsets of several language features were annotated including phonemes, syllables, words, clauses and sentences. We also marked onsets and offsets of individual verbs due to their central role in the sentence structure. In addition, we annotated onsets and offsets of words spoken by each story character: Pippi, Annika, Tommi, Mom, Dad and Konrad.
Characters, scenes and higher-level concepts
Video annotations28 were obtained using the commercial deep neural network Clarifai General (https://www.clarifai.com/). The network received video frames one by one and returned the 20 visual concept labels that were most likely present in each frame. The network was pretrained on a large dataset using a dictionary of 5,000 unique visual concepts. The output of the visual concept recognition model was then manually corrected by removing irrelevant labels and adjusting incorrect assignments. The final list of labels consisted of 129 unique visual concepts. Most labels referred to objects present in the frame, for example, ‘house’, ‘table’, ‘animal’, ‘rock’, etc.; some labels described a state or relation (such as ‘seated’, ‘equestrian’, ‘wooden’, ‘together’, ‘outdoors’, etc.) or an action (such as ‘walk’, ‘travel’, ‘dance’, ‘climb’, ‘smile’, etc.).
In addition, we manually annotated presence of each story character in each frame.
Data acquisition details
IEEG data acquisition
During the experiments, iEEG data were acquired with a 128-channel recording system (Micromed, Treviso, Italy). In most cases data were sampled at the rate of 512 Hz and filtered at 0.15–134.4 Hz (38 patients). In some cases, data were sampled at the rate of 2048 Hz and filtered at 0.3–500 Hz (13 patients). An external reference electrode, typically placed on the mastoid part of the temporal bone, was used as signal reference. In addition to the clinical iEEG recordings, six participants were implanted with HD ECoG grids. In two participants, HD ECoG data were recorded at 2000 Hz (filtered at 0.3–500 Hz) with a Blackrock system (Blackrock Microsystems, https://blackrockneurotech.com/) simultaneously with the clinical channels recorded with Micromed. In four participants, HD ECoG data were also recorded via Micromed at 512 Hz (filtered at 0.15–134.4 Hz). In three of these patients, HD ECoG data were recorded simultaneously with and in addition to the clinical iEEG channels. In one patient, only HD or only clinical electrodes could be recorded at the same time, so no simultaneous data for HD and clinical iEEG are available for this patient. Instead, the patient performed the task twice: once while clinical iEEG data were recorded, and another time while HD ECoG data were recorded. The resting state data of this patient were also acquired asynchronously.
Additional behavioral recordings including electrocardiogram, electromyography, electrooculogram and respiration rate were collected as part of the clinical trajectory and are available for some patients.
FMRI data acquisition
Functional images were acquired on a Philips Achieva 3 T MRI scanner using 3D-PRESTO40,41. Whole-brain images were acquired with the following parameters: TR/TE = 22.5/33.2 ms, time per volume 608 ms, FA = 10°, 40 slices, FOV = 224 × 256 × 160 mm and voxel size of 4 mm isotropic.
Structural data acquisition
For most participants, structural T1-weighted images were acquired on a Philips Achieva 3 T MRI scanner using TR/TE = 8.4/3.2 ms, FA = 8°, 175 slices, FOV = 228 × 228 × 175 mm and voxel size of 1 × 1 × 1 mm. One participant had a 7 T structural scan. Twenty participants had a 3 T scan with parameters different from those described above: for example, sub-millimeter voxel size (thirteen patients), voxel size of 1 × 1 × 1.1 mm (three patients) or a different number of slices (twenty patients, with an average of 215 slices and a standard deviation of 85).
Data processing and curation
Localization of iEEG electrodes on structural images
ECoG and sEEG electrodes were detected on each patient’s post-operative computer tomography scan and coregistered to the anatomical MRI in the native space (T1w images). ECoG electrode locations were additionally corrected for brain shift and projected onto the brain tissue42,43.
We do not provide normalized electrode positions due to the difficulties in normalization and noticeable distortions in resulting positions. Only for the purpose of visualization of the total electrode coverage across all iEEG participants (Fig. 1c), did we project individual electrode locations to Montreal Neurological Institute (MNI) space using SPM12 procedures of anatomical segmentation, normalization and image reslicing.
Identification of bad electrodes
A status (‘good’ or ‘bad’) is provided for each channel in the dataset. Channels with the ‘bad’ status also have a description that explains why they are labeled ‘bad’. Each channel was visually inspected with respect to signal outliers and artifacts. Noisy channels are marked ‘noisy’ in the description of the channel status. Some electrodes were placed on top of another electrode grid or strip during implantation. These channels are marked ‘on top’ and are not recommended for use in data analyses as they do not record directly from brain tissue.
Defacing of structural images
All structural images were defaced using either SPM12 or Fieldtrip44 toolboxes to comply with the requirements for sharing de-identified medical data.
Data validation procedures
IEEG data validation
Data were preprocessed per subject using MNE-Python45 (https://mne.tools). First, we selected channels of type “ECoG” and “sEEG”, and discarded previously identified bad channels. Then, a notch filter was applied to account for the line noise at 50 Hz and its harmonics. The data were then re-referenced using the common average signal, and band-specific neural signals were extracted using the Hilbert transform: delta (δ, 1–4 Hz), theta (θ, 5–8 Hz), alpha (α, 8–12 Hz), beta (β, 13–24 Hz) and the high-frequency band (HFB) component (60–120 Hz). Final envelopes were downsampled to 25 Hz.
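The pipeline above can be sketched as follows. The published analysis used MNE-Python, but the same steps can be reproduced with plain NumPy/SciPy; the function below is a minimal illustration assuming a (channels × samples) array of good channels only, and the filter order and notch Q factor are illustrative choices rather than the exact parameters used by the authors.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, iirnotch, resample

def preprocess_ieeg(data, fs=512, line_freq=50, band=(60, 120), out_fs=25):
    """Band-envelope pipeline sketch: notch out line noise and harmonics,
    re-reference to the common average, band-pass, Hilbert envelope,
    downsample. data: (n_channels, n_samples) array of good channels."""
    data = np.asarray(data, dtype=float)
    # Notch filters at the line frequency and its harmonics below Nyquist
    for f in range(line_freq, int(fs // 2), line_freq):
        b, a = iirnotch(f, Q=30, fs=fs)
        data = filtfilt(b, a, data, axis=1)
    # Common average reference
    data = data - data.mean(axis=0, keepdims=True)
    # Band-pass for the band of interest (HFB, 60-120 Hz, by default)
    b, a = butter(4, band, btype="bandpass", fs=fs)
    band_sig = filtfilt(b, a, data, axis=1)
    # Amplitude envelope via the Hilbert transform
    env = np.abs(hilbert(band_sig, axis=1))
    # Downsample the envelope to out_fs
    n_out = int(env.shape[1] * out_fs / fs)
    return resample(env, n_out, axis=1)
```

The same function applies to the other bands (delta, theta, alpha, beta) by changing the `band` argument.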
As a basic check of the subjects’ response to the task we compared their neural activity during speech and music blocks. For this, we estimated an ordinary least squares fit to the HFB envelopes using the block design boxcar function. The fit and statistical analysis were performed using Python package statsmodels46 (https://www.statsmodels.org). Given a possible delay in the brain’s response to the auditory input and the possibility that this delay could vary across electrodes, we calculated the fit per electrode at all time lags within 1 second after the sound onset. The best fit across the lags was recorded together with the lag value. Significance of the fit was assessed parametrically based on the t-statistic for the block design regression weight. Here we report only positive t-statistics, which correspond to higher responses to speech and lower responses to music (for the block design predictor with zeros in music blocks and ones in speech blocks) that are significant at p < 0.01, Bonferroni corrected for the number of electrodes and lags.
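The lagged fit can be illustrated as below. The authors used statsmodels for this analysis; this sketch computes the OLS t-statistic directly with NumPy, assumes the boxcar is sampled at the same 25 Hz rate as the envelopes, and takes the largest |t| across lags as one reasonable reading of "best fit".

```python
import numpy as np

def ols_tstat(y, x):
    """t-statistic of the slope in y ~ a + b*x (ordinary least squares)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)       # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # parameter covariance
    return beta[1] / np.sqrt(cov[1, 1])

def best_lag_fit(envelope, boxcar, fs=25, max_lag_s=1.0):
    """Fit envelope ~ boxcar at every lag up to max_lag_s; return
    (lag_in_samples, t_statistic) of the fit with the largest |t|."""
    best_lag, best_t = 0, 0.0
    for lag in range(int(max_lag_s * fs) + 1):
        y = envelope[lag:]                       # shift the envelope
        t = ols_tstat(y, boxcar[:len(y)])
        if abs(t) > abs(best_t):
            best_lag, best_t = lag, t
    return best_lag, best_t
```

A positive t-statistic then indicates a larger envelope during speech blocks (ones in the boxcar) than during music blocks (zeros).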
In addition, per electrode we computed signed r-squared values (calculated as a squared Pearson correlation coefficient, preserving the sign of the correlation) between speech and music blocks, speech blocks and task rest, and speech blocks and natural rest. To reduce the number of multiple comparisons (number of electrodes × frequency bands), the analysis was only performed on the electrodes with a significant linear fit to the block design (see the analysis above). The three comparisons (speech vs music, speech vs task rest and speech vs natural rest) were made separately for all extracted iEEG frequency bands. Significance of reported r-squared values was determined parametrically; reported values are significant at p < 0.05, Bonferroni corrected for the number of electrodes and frequency bands.
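A common way to compute such a signed r-squared between two conditions is to correlate the pooled samples with a binary condition label, as in this minimal sketch (the exact definition of a "sample", e.g. per time point or per block, is an assumption here):

```python
import numpy as np

def signed_r_squared(a, b):
    """Signed r²: squared Pearson correlation between band-power samples
    and a binary condition label, keeping the sign of the correlation.
    a, b: 1-D arrays of band power for the two conditions; positive
    output means higher power in condition a."""
    x = np.r_[a, b]
    labels = np.r_[np.ones(len(a)), np.zeros(len(b))]
    r = np.corrcoef(x, labels)[0, 1]
    return np.sign(r) * r**2
```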
FMRI data validation
To assess basic data quality we first analyzed motion displacement plots (calculated using fsl_motion_outliers based on estimation of frame displacement48). We also computed the temporal signal-to-noise-ratio (tSNR) per voxel as the mean of the functional volumes over time divided by their standard deviation over time49. Nibabel50 (https://nipy.org/nibabel/) and Numpy51 (https://numpy.org/) Python libraries were used for this. To visualize average tSNR on the brain surface, we normalized individual subject’s tSNR maps and computed the average over all subjects per voxel in the MNI space. This map was projected on the standard average Freesurfer surface.
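The tSNR computation itself reduces to a per-voxel ratio over the time axis. The sketch below assumes the functional run has already been loaded into a 4-D array (for example with nibabel, via `nib.load(path).get_fdata()`):

```python
import numpy as np

def tsnr(bold):
    """Temporal signal-to-noise ratio per voxel: the mean of the
    functional volumes over time divided by their standard deviation
    over time. bold: 4-D array (x, y, z, time). Voxels with zero
    temporal variance (e.g. outside the brain) are set to 0."""
    mean = bold.mean(axis=-1)
    std = bold.std(axis=-1)
    return np.divide(mean, std, out=np.zeros_like(mean), where=std > 0)
```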
Then, subject-specific and group-level statistical analyses were performed to compare fMRI responses to speech and music. A general linear model was fitted to the fMRI data using the block design boxcar function and motion parameters as additional covariates. These analyses were performed using default parameters in FSL FEAT52,53.
Conversion to BIDS
We used in-house software to convert raw data files to the BIDS (fMRI) and iBIDS (iEEG) format. The code is available here: https://github.com/UMCU-RIBS/xelo2. Validation checks were performed using BIDS Validator (https://github.com/bids-standard/bids-validator), MNE BIDS routines (https://mne.tools/mne-bids/) and manual inspection of the (i)BIDS data.
The dataset is freely available in the OpenNeuro database (https://openneuro.org/datasets/ds003688)38. All personally identifiable information has been removed and individual MRI scans have been defaced. The order in which subjects are presented in the dataset has been randomized and therefore does not follow any identifiable pattern (for example, alphabetical order or order by date of the experiment).
Data are organized according to the BIDS format37,54. The root folder contains meta-information about the description of the dataset (dataset_description.json); the list of participants along with their demographic details, handedness and language-dominant hemisphere (participants.tsv); the stimulus folder (stimuli) and individual data folders per participant (for example, sub-01, Fig. 2a).
Two folders are provided under the stimuli directory: sound and video (Fig. 2d,e), each containing annotations for the corresponding stream of the film. The sound folder contains 13 tsv files. Each file contains a soundtrack annotation with respect to the feature in the name of the file. For example, sound_annotation_words.tsv is the annotation of onsets and offsets for individual words. Each file has three columns: item (for example, individual words, phonemes, etc., depending on the feature), its onset and its offset in seconds. There are seven annotations for language features: phonemes, syllables, words, verbs, clauses, sentences and questions; and six annotations for individual story characters: Pippi, Annika, Tommi, Mom, Dad and Konrad.
The video folder contains 135 tsv files: 129 for individual visual concepts and 6 for individual story characters. The list of visual concepts was determined based on the visual concept recognition model, which automatically labelled each frame with visual objects and concepts it had been trained to detect (see Methods for more detail). Each file contains two columns: onsets and offsets in seconds.
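Assuming the annotation files follow the column layout described above (a header row with `item`, `onset`, `offset` for sound files, and `onset`, `offset` for video files; the exact header names should be verified against the files themselves), they can be read with the Python standard library alone:

```python
import csv

def read_annotation(path):
    """Read one stimulus annotation file (tab-separated, with header).
    Returns a list of row dictionaries with onset/offset parsed as
    floats (seconds); any other columns (e.g. 'item') stay as strings."""
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            row["onset"] = float(row["onset"])
            row["offset"] = float(row["offset"])
            rows.append(row)
    return rows
```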
Participant data folders
Each participant’s folder contains one or two directories depending on the type of brain recordings available. For patients who have both fMRI and iEEG data the folder contains two directories corresponding to (f)MRI and intracranial recording sessions respectively (for example, ses-3t1 and ses-iemu1). For patients who only have iEEG data, the folder still contains two directories, and the MRI directory only contains a structural MRI scan. For patients who only have fMRI data, there is only one directory corresponding to the (f)MRI session. Individual details of (f)MRI and intracranial data sessions can vary across participants. For example, one patient has a 7T MRI scan and therefore their (f)MRI folder is named ses-7t1.
If available, iEEG recordings are stored in the patient-specific iEEG folder (for example, ses-iemu1). The folder contains all iEEG-related information including
Locations of clinical iEEG (*acq-clinical_electrodes.tsv) and HD ECoG (*acq-HDgrid_electrodes.tsv) electrodes together with a sidecar json file that contains electrode metadata. Since both sEEG and clinical ECoG are acquired through the clinical setup, their electrode locations are stored together in the *acq-clinical* file and can be differentiated by the column ‘type’. In three HD participants, HD data were recorded through the clinical setup and are therefore part of the *acq-clinical* files. These HD ECoG electrodes can be identified based on the column ‘size’ that represents the recording surface area (mm²), typically ~1 mm² for HD electrodes. Electrode locations are provided in the native space (coregistered with the patient’s T1w anatomical images) and are the same for both the movie-watching and resting state tasks.
Rendering of electrode locations (*render_photo*.jpg) per type of electrodes (‘ecog’, ‘seeg’ and ‘HD’), in the native subject space using a cortex mesh generated either from an SPM12 (Wellcome Trust Centre for Neuroimaging, University College London, https://www.fil.ion.ucl.ac.uk/spm) volumetric map or from a Freesurfer (http://surfer.nmr.mgh.harvard.edu) surface reconstruction. We visualized ECoG and sEEG electrodes separately because sEEG electrodes are best shown on transparent brain surfaces, whereas ECoG and HD ECoG electrodes are best displayed on the opaque brain. Each electrode in the rendering image is numbered by the electrode’s row index in the corresponding electrode location file (*electrodes.tsv) starting with index 1 for row one. This was done for convenience as electrode names were too long to fit on the electrode rendering, yet plotting their indices allowed for quick identification of the corresponding entries in electrode location (*electrodes.tsv) and montage (*channels.tsv) files.
Per task (‘film’ and ‘rest’) and, when available, per acquisition type (‘clinical’ and ‘HDgrid’), a file with montage of the recording channels (*channels.tsv). In rare cases the montage differs between the tasks. The file contains information about iEEG electrodes and additional recorded channels (for example, marker channel, respiration rate, electrooculography, etc.). Per channel, signal acquisition details are provided, including units of measurement, sampling frequency, channel status and others. Channel status indicates which electrodes are recommended for analyses (good) and which are not (bad).
Per task (‘film’ and ‘rest’) and, when available, per acquisition type (‘clinical’ and ‘HDgrid’), a file with experimental events (*events.tsv). For the movie-watching task (‘film’) the file contains onsets and offsets for the task start, each music and speech block and the task end. For the resting state data (‘rest’) the file contains onsets and offsets of the three-minute rest period that either came from the task logs (in the resting state task) or were manually annotated (in natural rest).
Per task (‘film’ and ‘rest’) and, when available, per acquisition type (‘clinical’ and ‘HDgrid’), main files with iEEG recordings in the BrainVision format (*ieeg.eeg, *ieeg.vmrk, *ieeg.vhdr).
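Putting the naming conventions above together, the expected file paths for one subject's iEEG session can be constructed as in the following sketch. This is illustrative only: actual files may carry additional BIDS entities (such as run indices), so the dataset listing itself is authoritative.

```python
from pathlib import Path

def ieeg_files(root, sub, ses="iemu1", task="film", acq="clinical"):
    """Construct expected iBIDS paths for one subject's iEEG recording,
    following the naming pattern described above. Illustrative sketch:
    real files may include extra entities (e.g. run indices)."""
    base = Path(root) / f"sub-{sub}" / f"ses-{ses}" / "ieeg"
    stem = f"sub-{sub}_ses-{ses}_task-{task}_acq-{acq}"
    return {
        "data": base / f"{stem}_ieeg.vhdr",        # BrainVision header
        "channels": base / f"{stem}_channels.tsv",  # montage and channel status
        "events": base / f"{stem}_events.tsv",      # block onsets/offsets
        # electrodes.tsv carries no task entity: locations are shared by tasks
        "electrodes": base / f"sub-{sub}_ses-{ses}_acq-{acq}_electrodes.tsv",
    }
```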
All participants have a folder that corresponds to the (f)MRI recording session (for example, ses-3t1). The folder contains the anatomical MRI scan (anat directory) and functional MRI data (func directory) from the movie-watching experiment if available. (F)MRI data are provided in the NIfTI format with sidecar json files that store additional metadata. Functional images are accompanied by the *events.tsv file that contains onsets and offsets of speech and music blocks of the movie-watching task measured in seconds.
IEEG data validation
Bad iEEG channels
We summarized the bad channels and the channels recommended for analyses. Channels marked as bad are also excluded from the results presented below. These electrodes either were considered noisy based on the visual inspection of the data, or were located on top of other electrodes based on the photographs from the implantation or explantation surgeries. Only four participants had more than 10% of their intracranial electrodes marked as bad channels, whereas the median number of good channels across participants is 79 (Fig. 3a).
Response to the audiovisual movie task
Smaller subsets of the present dataset have previously been analyzed with respect to the auditory and visual processing of the movie stimulus28,55,56,57. Here, we only show some basic results regarding the overall iEEG response to the task.
First, we performed a simple regression analysis that compares HFB responses during speech and music blocks. We mapped the resulting positive t-statistics that were significant at p < 0.001 (and subsequently Bonferroni-corrected for the total number of electrodes) onto the brain surface. This map showed preference to speech blocks over music throughout the perisylvian regions bilaterally and in inferior frontal gyrus, premotor and motor cortices on the left hemisphere (Fig. 3f).
In addition, we computed mean power changes in frequency bands other than HFB and compared them across different conditions: speech, music and rest. We calculated signed r-squared values for three comparisons: speech vs music, speech vs task rest and speech vs natural rest, separately for delta, theta, alpha, beta and HFB mean power signal. The reported values are significant at p < 0.05 (Fig. 3c–e). Consistent with the literature, the average pattern of HFB response was opposite to that of the lower frequency bands (theta and alpha) in all comparisons58,59,60. One notable difference between speech vs rest and speech vs music comparisons was the stronger presence of positive r-squared values in the beta band during rest.
Resting state task and natural resting state data
To offer some form of comparison between rest data from a task and natural rest data from continuous 24/7 recordings, we report signed r-squared values for two comparisons: speech vs task rest and speech vs natural rest (Fig. 3d,e). Both plots look very similar across all frequency bands and suggest that either type of resting state can be used as a baseline or control condition for investigating speech responses. Further investigation of the similarities and differences between the two sources of resting state data is an interesting research avenue that the present dataset easily lends itself to.
FMRI data validation
Analysis of motion
Based on the motion parameters obtained as part of the FSL preprocessing pipeline, we calculated framewise displacement48 of each participant’s head in the scanner (Fig. 3h). Overall, there was little motion above one voxel size. An analysis of motion-based outliers showed that only five participants had more than 5% of their functional volumes marked as outliers (Fig. 3i). The amount of motion in these five participants may be considered excessive compared to healthy volunteers; however, it is common in the clinical population61. These five participants had iEEG data from the same experiment and were therefore included in the dataset. Many methods for correcting excessive motion exist, and we refer readers to some of them in Usage Notes.
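Framewise displacement following Power et al.48 is the sum of absolute backward differences of the six rigid-body motion parameters, with rotations converted to arc length on a 50 mm sphere. A sketch, assuming translations occupy the first three columns (FSL’s MCFLIRT .par files store rotations first, so columns may need reordering):

```python
import numpy as np

def framewise_displacement(motion, radius=50.0):
    """Framewise displacement (Power et al., 2012).

    motion : (n_volumes, 6) array with 3 translations (mm) followed by
             3 rotations (radians). Rotations are converted to mm as
             arc length on a sphere of `radius` mm.
    """
    d = np.abs(np.diff(motion, axis=0))
    d[:, 3:] *= radius            # rotations -> displacement in mm
    fd = d.sum(axis=1)
    return np.concatenate([[0.0], fd])  # first volume has FD 0 by convention

# Toy example: a single 1 mm translation jump between volumes 1 and 2.
motion = np.zeros((4, 6))
motion[2:, 0] = 1.0
print(framewise_displacement(motion))  # [0. 0. 1. 0.]
```

Volumes whose FD exceeds a chosen threshold (often a fraction of the voxel size) can then be marked as outliers, as in the analysis above.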
Temporal signal-to-noise ratio
Temporal signal-to-noise ratio (tSNR) is a measure of signal dropout and effects of noise over time. It can be used to estimate how much scanning time is necessary to detect statistical effects of varying strength in the data49. Block designs are known to provide a robust method for observing reliable activation patterns, but the movie stimulus also contained more sparse auditory and visual events. Given high enough tSNR, fMRI data can be analysed with respect to such individual events. The mean whole-brain tSNR across participants was 66.34 ± 16.13, not much lower than the value (≈70) typically reported in fMRI datasets with healthy volunteers and a smaller voxel size7,62 (Fig. 3j). Inspection of the brain maps revealed a typical pattern, with lower tSNR values in the anterior temporal lobe and the orbitofrontal cortex (Fig. 3k). Somewhat lower tSNR values were also observed dorsally on the gyri. This could be due to partial volume effects caused by the large voxel size: large voxels at the brain surface can sample not only from brain tissue but also from cerebrospinal fluid and parts of the skull, which leads to field inhomogeneities.
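tSNR itself is simply the voxelwise temporal mean divided by the temporal standard deviation. A minimal sketch on a made-up 4D volume:

```python
import numpy as np

def tsnr(bold):
    """Voxelwise temporal SNR: mean over time / std over time.
    `bold` is a 4D array (x, y, z, t); voxels with zero variance get 0."""
    mean = bold.mean(axis=-1)
    std = bold.std(axis=-1)
    return np.divide(mean, std, out=np.zeros_like(mean), where=std > 0)

# Hypothetical volume: baseline signal 1000 with noise SD 15
# gives tSNR of roughly 1000 / 15 ~ 67.
rng = np.random.default_rng(2)
bold = 1000 + rng.normal(0, 15, (10, 10, 10, 200))
print(tsnr(bold).mean())
```

In practice the computation is run on motion-corrected data within a brain mask, then averaged or mapped voxelwise as in Fig. 3j,k.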
Response to the audiovisual movie task
We also performed a simple analysis to estimate the participants’ overall response to the audiovisual movie task. For this, we fitted a general linear model per participant and estimated the group effects for the contrast comparing speech and music blocks. The group statistic showed a strong effect in brain areas typically associated with auditory and language processing, including the bilateral superior temporal gyrus, left inferior frontal gyrus, bilateral precentral gyrus and bilateral supplementary motor cortex (Fig. 3l–n).
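The logic of the per-participant GLM and the speech-minus-music contrast can be illustrated with a toy single-voxel model. HRF convolution, prewhitening and the FSL group-level machinery are omitted, and the regressor timings and effect sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 120  # hypothetical number of volumes

# Boxcar regressors for speech and music blocks (no overlap).
t = np.arange(n)
speech = ((t % 40) < 10).astype(float)
music = (((t % 40) >= 20) & ((t % 40) < 30)).astype(float)
X = np.column_stack([speech, music, np.ones(n)])  # design matrix + intercept

# Simulated voxel: responds more to speech (beta 2.0) than music (beta 0.5).
beta_true = np.array([2.0, 0.5, 100.0])
y = X @ beta_true + rng.normal(0, 0.5, n)

# Ordinary least squares fit and the speech > music contrast.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
contrast = np.array([1.0, -1.0, 0.0])
print(contrast @ beta)  # close to the true difference of 1.5
```

Per-participant contrast estimates of this kind are then carried to a group-level model (FLAME in FSL) to produce the maps in Fig. 3l–n.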
Usage Notes
The dataset can be downloaded from the open public repository38. The data are released under the Public Domain Dedication and License and are freely available with no restrictions on use. Below we summarize a number of points to keep in mind when working with this dataset.
The iEEG electrode coverage is much denser in the left than the right hemisphere. This should be taken into account when interpreting results of future analyses. In some cases, it may be a good idea to confine iEEG analyses to the language-dominant hemisphere. Combining iEEG with whole-brain fMRI data may be useful when addressing inter-hemispheric differences.
The present dataset includes a few difficult cases where accurate estimation of electrode locations was challenging. A small number of patients had an earlier (i.e. before the iEEG implantation) tissue resection followed by fluid build-up in the resection cavity, or a pathology (for example, a tumor), which may have affected the tissue under the electrodes. In addition, in the three patients who only had sEEG electrodes, the CT scan used for electrode localization occasionally lacked the resolution to accurately separate the centers of mass of individual electrodes.
In one participant (‘sub-44’) it was not possible to record HD and clinical ECoG data simultaneously, as both types of electrodes were recorded through the Micromed system, which is limited to 128 channels. The montage therefore needed to be changed to switch between the recordings. Movie-watching and rest data in this participant were recorded twice: first with clinical ECoG, and several days later with the HD ECoG grid. Dataset users may need to take this into account when processing this participant’s iEEG data.
In one participant (‘sub-29’) there was no temporal synchronization in the resting state recordings between HD and clinical ECoG data. This was due to an error in the recording setup.
One participant (‘sub-32’) is missing resting state iEEG data. There were no 24/7 continuous recordings of this patient available from the clinic.
It has been shown that, in general, iEEG responses recorded from patients with epilepsy reflect states similar to healthy controls63, yet it is possible that some individual patient’s data can be affected by epileptic or interictal events.
In preparation of this dataset we identified bad channels in each participant’s recordings. This was done based on visual inspection, calculation of basic statistics of the signal (mean signal and its variance) and photographs from implantation or explantation surgeries. Several alternative methods have been proposed to automate the process, and we encourage users to explore them64,65.
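A simple variance-based screen of the kind described above might look as follows. The threshold and the robust z-score formulation are illustrative choices, not the exact procedure used for this dataset, and any automatic flagging should be checked against visual inspection:

```python
import numpy as np

def flag_bad_channels(data, z_thresh=3.0):
    """Flag channels whose log-variance deviates from the median by more
    than `z_thresh` robust z-scores. `data` is (n_channels, n_samples).
    Catches both noisy channels (high variance) and flat or
    disconnected channels (near-zero variance)."""
    log_var = np.log(data.var(axis=1))
    med = np.median(log_var)
    mad = np.median(np.abs(log_var - med)) * 1.4826  # robust SD estimate
    z = (log_var - med) / mad
    return np.flatnonzero(np.abs(z) > z_thresh)

# Simulated recording with one noisy and one flat channel.
rng = np.random.default_rng(3)
data = rng.normal(0, 1, (16, 5000))
data[4] *= 20.0    # noisy channel
data[9] *= 0.001   # near-flat (disconnected) channel
print(flag_bad_channels(data))
```

Flagged channels should be excluded before re-referencing, since a single bad channel contaminates a common average reference.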
We advise processing HD ECoG and sEEG data separately from clinical ECoG data. For sEEG, a bipolar reference may be preferred, taking into account whether contacts lie in grey or white matter66,67. HD ECoG data allow zooming in on neural processing in one specific region (typically, sensorimotor cortex), and it is therefore best to use either a local or a separate common average reference for them. These recordings also often have a higher sampling rate (2000 Hz), which can be leveraged for HFB analyses.
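The two referencing schemes can be sketched in a few lines. This is a minimal illustration; in practice the common average should be computed per grid, bipolar pairs should be formed per sEEG shaft, and bad channels excluded beforehand:

```python
import numpy as np

def common_average_reference(data):
    """Subtract the across-channel mean at every sample (CAR).
    `data` is (n_channels, n_samples); apply separately per grid."""
    return data - data.mean(axis=0, keepdims=True)

def bipolar_reference(data):
    """Difference between neighbouring contacts on a single sEEG shaft.
    Returns (n_channels - 1, n_samples): row i is contact i+1 - contact i."""
    return np.diff(data, axis=0)

# Simulated shaft: 8 contacts sharing a common noise component,
# which CAR and bipolar referencing both suppress.
rng = np.random.default_rng(4)
shaft = rng.normal(0, 1, (8, 100)) + rng.normal(0, 1, (1, 100))
car = common_average_reference(shaft)
bip = bipolar_reference(shaft)
print(car.shape, bip.shape)  # (8, 100) (7, 100)
```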
Physiological measures among the iEEG channels (electrocardiogram, breathing and electrooculography), if available, are a valuable source of information. Previously we used electrooculography to infer saccade data. Moreover, patterns of eye blinks have been previously related to processing of relevant detail in perceptual input68. Electrocardiogram and breathing have been shown to correlate with cognitive states during experimental stimulation69,70,71.
Resting state data are another useful source of information. They can be used as a baseline for the task data, and they can also be studied in their own right by exploring the internal dynamics of task-free neural activity.
Participants who performed both iEEG and fMRI tasks watched the short film twice: first in the MRI scanner, and later during iEEG recordings. Typically, there were at least several days (sometimes weeks) between the fMRI and iEEG sessions. Nonetheless, dataset users may need to take into account that at the time of the iEEG recordings these patients were already familiar with the film from the earlier fMRI experiment.
PRESTO scans have superior temporal resolution compared to the standard echo-planar imaging (EPI) sequence41. Importantly, PRESTO is a 3D sequence and therefore no slice timing correction is required when processing these data (hence no slice timing information is provided in the BIDS files). It has also been noted that given the 3D nature of the PRESTO fMRI scans, effects of motion differ from those observed in 2D EPI41,72. Since most movements are corrected during acquisition with PRESTO, correction for motion artifacts by using motion parameters may have no beneficial effect on the data.
In four fMRI participants, the amount of estimated motion exceeded one voxel size (4 mm). This excessive motion reflects the fact that all fMRI data come from epilepsy patients. This is intentional, as the fMRI data are meant to be complementary to the iEEG recordings, and here we provide data from a considerable number of patients who performed the same task with both recording modalities. However, patients are less likely to remain stationary in the scanner to the same degree as healthy volunteers. Several methods to account for this motion have been proposed and successfully used to mitigate the issue, including motion scrubbing73, Volterra expansion for general linear models61,74, independent component analysis for artifact removal75 and other despiking and denoising methods. We recommend that users explore software incorporating advanced motion correction methods, such as fMRIprep76 and the ArtRepair77 toolbox for SPM.
The soundtrack of the audiovisual film was originally in Swedish. For all our experiments with Dutch patients, many of whom were children, we used a movie version dubbed into Dutch.
Four participants (‘sub-11’, ‘sub-37’, ‘sub-49’ and ‘sub-63’) have missing information about their handedness.
Four participants (‘sub-01’, ‘sub-11’, ‘sub-30’ and ‘sub-33’) have missing information about their language-dominant hemisphere.
The code used to perform technical validation on the (i)BIDS dataset is available at https://github.com/UMCU-RIBS/ieeg-fmri-dataset-validation. We also provide a set of utility scripts to help new users get started with processing and visualizing the data (https://github.com/UMCU-RIBS/ieeg-fmri-dataset-quickstart).
Poldrack, R. A. et al. Toward open sharing of task-based fmri data: the openfmri project. Frontiers in neuroinformatics 7, 12 (2013).
Van Essen, D. C. et al. The wu-minn human connectome project: an overview. Neuroimage 80, 62–79 (2013).
Gilmore, R. O., Diaz, M. T., Wyble, B. A. & Yarkoni, T. Progress toward openness, transparency, and reproducibility in cognitive neuroscience. Annals of the New York Academy of Sciences 1396, 5–18 (2017).
Hanke, M. et al. A high-resolution 7-tesla fmri dataset from complex natural stimulation with an audio movie. Scientific data 1, 1–18 (2014).
Wakeman, D. G. & Henson, R. N. A multi-subject, multi-modal human neuroimaging dataset. Scientific data 2, 1–10 (2015).
Schoffelen, J.-M. et al. A 204-subject multimodal neuroimaging dataset to study language processing. Scientific data 6, 1–13 (2019).
di Oleggio Castello, M. V., Chauhan, V., Jiahui, G. & Gobbini, M. I. An fmri dataset in response to “the grand budapest hotel”, a socially-rich, naturalistic movie. Scientific Data 7, 1–9 (2020).
Aliko, S., Huang, J., Gheorghiu, F., Meliss, S. & Skipper, J. I. A naturalistic neuroimaging database for understanding the brain using ecological stimuli. Scientific Data 7, 1–21 (2020).
Nastase, S. A. et al. Narratives: fmri data for evaluating models of naturalistic language comprehension. bioRxiv (2020).
Vanderwal, T., Eilbott, J. & Castellanos, F. X. Movies in the magnet: Naturalistic paradigms in developmental functional neuroimaging. Developmental cognitive neuroscience 36, 100600 (2019).
Sonkusare, S., Breakspear, M. & Guo, C. Naturalistic stimuli in neuroscience: critically acclaimed. Trends in cognitive sciences 23, 699–714 (2019).
van der Meer, J. N., Breakspear, M., Chang, L. J., Sonkusare, S. & Cocchi, L. Movie viewing elicits rich and reliable brain state dynamics. Nature communications 11, 1–14 (2020).
Poline, J.-B. et al. Data sharing in neuroimaging research. Frontiers in neuroinformatics 6, 9 (2012).
Milham, M. P. et al. Assessment of the impact of shared brain imaging data on the scientific literature. Nature Communications 9, 1–7 (2018).
Stieger, J. R., Engel, S. A. & He, B. Continuous sensorimotor rhythm based brain computer interface learning in a large population. Scientific Data 8, 1–10 (2021).
Nieto, N., Peterson, V., Rufiner, H. L., Kamienkowski, J. & Spies, R. “Thinking out loud”: an open-access eeg-based bci dataset for inner speech recognition. bioRxiv (2021).
Menon, R. S. & Kim, S.-G. Spatial and temporal limits in cognitive neuroimaging with fmri. Trends in cognitive sciences 3, 207–216 (1999).
Logothetis, N. K. What we can do and what we cannot do with fmri. Nature 453, 869–878 (2008).
Nunez, P. et al. A theoretical and experimental study of high resolution eeg based on surface laplacians and cortical imaging. Electroencephalography and clinical neurophysiology 90, 40–57 (1994).
Freeman, W. J., Holmes, M. D., Burke, B. C. & Vanhatalo, S. Spatial spectra of scalp eeg and emg from awake humans. Clinical Neurophysiology 114, 1053–1068 (2003).
Muthukumaraswamy, S. High-frequency brain activity and muscle artifacts in meg/eeg: a review and recommendations. Frontiers in human neuroscience 7, 138 (2013).
Chang, E. F. et al. Categorical speech representation in human superior temporal gyrus. Nature neuroscience 13, 1428 (2010).
Honey, C. J. et al. Slow cortical dynamics and the accumulation of information over long timescales. Neuron 76, 423–434 (2012).
Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013).
Ding, N., Melloni, L., Zhang, H., Tian, X. & Poeppel, D. Cortical tracking of hierarchical linguistic structures in connected speech. Nature neuroscience 19, 158–164 (2016).
Wang, W., Degenhart, A. D., Sudre, G. P., Pomerleau, D. A. & Tyler-Kabara, E. C. Decoding semantic information from human electrocorticographic (ecog) signals. In 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 6294–6298 (IEEE, 2011).
Weidemann, C. T. et al. Neural activity reveals interactions between episodic and semantic memory systems during retrieval. Journal of Experimental Psychology: General 148, 1 (2019).
Berezutskaya, J. et al. Cortical network responses map onto data-driven features that capture visual semantics of movie fragments. Scientific reports 10, 1–21 (2020).
Derix, J. et al. From speech to thought: the neuronal basis of cognitive units in non-experimental, real-life communication investigated using ecog. Frontiers in human neuroscience 8, 383 (2014).
Iljina, O. et al. Neurolinguistic and machine-learning perspectives on direct speech bcis for restoration of naturalistic communication. Brain-Computer Interfaces 4, 186–199 (2017).
Martin, S., Millán, Jd. R., Knight, R. T. & Pasley, B. N. The use of intracranial recordings to decode human language: Challenges and opportunities. Brain and language 193, 73–83 (2019).
Rabbani, Q., Milsap, G. & Crone, N. E. The potential for a speech brain–computer interface using chronic electrocorticography. Neurotherapeutics 16, 144–165 (2019).
Herff, C., Krusienski, D. J. & Kubben, P. The potential of stereotactic-eeg for brain-computer interfaces: current progress and future directions. Frontiers in neuroscience 14, 123 (2020).
Miller, K. J. A library of human electrocorticographic data and analyses. Nature human behaviour 3, 1225–1235 (2019).
Fedele, T. et al. Dataset of neurons and intracranial eeg from human amygdala during aversive dynamic visual stimulation. OpenNeuro https://doi.org/10.18112/openneuro.ds003374.v1.1.1 (2020).
Li, A. et al. Epilepsy-ieeg-multicenter-dataset. OpenNeuro https://doi.org/10.18112/openneuro.ds003029.v1.0.2 (2020).
Holdgraf, C. et al. ieeg-bids, extending the brain imaging data structure specification to human intracranial electrophysiology. Scientific data 6, 1–6 (2019).
Berezutskaya, J. et al. Open multimodal ieeg-fmri dataset from naturalistic stimulation with a short audiovisual film. OpenNeuro https://doi.org/10.18112/openneuro.ds003688.v1.0.6 (2021).
Boersma, P. & Weenink, D. Praat: doing phonetics by computer [computer program]. Version 6.0.37. Retrieved February 3, 2018 (2018).
Van Gelderen, P. et al. Three-dimensional functional magnetic resonance imaging of human brain on a clinical 1.5-t scanner. Proceedings of the National Academy of Sciences 92, 6906–6910 (1995).
Neggers, S. F., Hermans, E. J. & Ramsey, N. F. Enhanced sensitivity with fast three-dimensional blood-oxygen-level-dependent functional mri: comparison of sense–presto and 2d-epi at 3 t. NMR in Biomedicine: An International Journal Devoted to the Development and Application of Magnetic Resonance In vivo 21, 663–676 (2008).
Hermes, D., Miller, K. J., Noordmans, H. J., Vansteensel, M. J. & Ramsey, N. F. Automated electrocorticographic electrode localization on individually rendered brain surfaces. Journal of neuroscience methods 185, 293–298 (2010).
Branco, M. P. et al. Alice: A tool for automatic localization of intra-cranial electrodes for clinical and high-density grids. Journal of neuroscience methods 301, 43–51 (2018).
Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J.-M. Fieldtrip: open source software for advanced analysis of meg, eeg, and invasive electrophysiological data. Computational intelligence and neuroscience 2011 (2011).
Gramfort, A. et al. MEG and EEG data analysis with MNE-Python. Frontiers in Neuroscience 7, 1–13 (2013).
Seabold, S. & Perktold, J. statsmodels: Econometric and statistical modeling with python. In 9th Python in Science Conference (2010).
Smith, S. M. et al. Advances in functional and structural mr image analysis and implementation as fsl. Neuroimage 23, S208–S219 (2004).
Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L. & Petersen, S. E. Spurious but systematic correlations in functional connectivity mri networks arise from subject motion. Neuroimage 59, 2142–2154 (2012).
Murphy, K., Bodurka, J. & Bandettini, P. A. How long to scan? the relationship between fmri temporal signal to noise ratio and necessary scan duration. Neuroimage 34, 565–574 (2007).
Brett, M. et al. nipy/nibabel: 3.2.1. Zenodo https://doi.org/10.5281/zenodo.4295521 (2020).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362, https://doi.org/10.1038/s41586-020-2649-2 (2020).
Woolrich, M. W., Ripley, B. D., Brady, M. & Smith, S. M. Temporal autocorrelation in univariate linear modeling of fmri data. Neuroimage 14, 1370–1386 (2001).
Woolrich, M. W., Behrens, T. E., Beckmann, C. F., Jenkinson, M. & Smith, S. M. Multilevel linear modelling for fmri group analysis using bayesian inference. Neuroimage 21, 1732–1747 (2004).
Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Scientific data 3, 1–9 (2016).
Berezutskaya, J., Freudenburg, Z. V., Güçlü, U., van Gerven, M. A. & Ramsey, N. F. Neural tuning to low-level features of speech throughout the perisylvian cortex. Journal of Neuroscience 37, 7906–7920 (2017).
Berezutskaya, J. et al. Modeling brain responses to perceived speech with lstm networks. In Benelearn, 149–153 (2017).
Berezutskaya, J., Freudenburg, Z. V., Güçlü, U., van Gerven, M. A. & Ramsey, N. F. Brain-optimized extraction of complex sound features that drive continuous auditory perception. PLoS computational biology 16, e1007992 (2020).
Crone, N. E. et al. Functional mapping of human sensorimotor cortex with electrocorticographic spectral analysis. i. alpha and beta event-related desynchronization. Brain: a journal of neurology 121, 2271–2299 (1998).
Crone, N. E., Miglioretti, D. L., Gordon, B. & Lesser, R. P. Functional mapping of human sensorimotor cortex with electrocorticographic spectral analysis. ii. event-related synchronization in the gamma band. Brain: a journal of neurology 121, 2301–2315 (1998).
Hermes, D. et al. Cortical theta wanes for language. Neuroimage 85, 738–748 (2014).
Lemieux, L., Salek-Haddadi, A., Lund, T. E., Laufs, H. & Carmichael, D. Modelling large motion events in fmri studies of patients with epilepsy. Magnetic resonance imaging 25, 894–901 (2007).
Sengupta, A. et al. A studyforrest extension, retinotopic mapping and localization of higher visual areas. Scientific data 3, 1–14 (2016).
Zhang, S. et al. Dynamic analysis on simultaneous ieeg-meg data via hidden markov model. medRxiv (2020).
Tuyisenge, V. et al. Automatic bad channel detection in intracranial electroencephalographic recordings using ensemble machine learning. Clinical Neurophysiology 129, 548–554 (2018).
Li, M. et al. Automatic bad channel detection in implantable brain-computer interfaces using multimodal features based on local field potentials and spike signals. Computers in biology and medicine 116, 103572 (2020).
Mercier, M. R. et al. Evaluation of cortical local field potential diffusion in stereotactic electro-encephalography recordings: a glimpse on white matter signal. Neuroimage 147, 219–232 (2017).
Li, G. et al. Optimal referencing for stereo-electroencephalographic (seeg) recordings. NeuroImage 183, 327–335 (2018).
Murali, S. & Haendel, B. The latency of spontaneous eye blinks marks relevant visual and auditory information processing. bioRxiv (2020).
Kern, M., Aertsen, A., Schulze-Bonhage, A. & Ball, T. Heart cycle-related effects on event-related potentials, spectral power changes, and connectivity patterns in the human ecog. Neuroimage 81, 178–190 (2013).
Tort, A. B., Hammer, M., Zhang, J., Brankačk, J. & Draguhn, A. Causal relations between cortical network oscillations and breathing frequency. bioRxiv (2020).
So, T. Y., Li, M. Y. E. & Lau, H. Between-subject correlation of heart rate variability predicts movie preferences. PloS one 16, e0247625 (2021).
van Gelderen, P., Duyn, J., Ramsey, N., Liu, G. & Moonen, C. The presto technique for fmri. NeuroImage 62, 676–681 (2012).
Power, J. D. et al. Methods to detect, characterize, and remove motion artifact in resting state fmri. Neuroimage 84, 320–341 (2014).
Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S. & Turner, R. Movement-related effects in fmri time-series. Magnetic resonance in medicine 35, 346–355 (1996).
Pruim, R. H. et al. Ica-aroma: A robust ica-based strategy for removing motion artifacts from fmri data. Neuroimage 112, 267–277 (2015).
Esteban, O. et al. fmriprep: a robust preprocessing pipeline for functional mri. Nature methods 16, 111–116 (2019).
Mazaika, P., Whitfield-Gabrieli, S., Reiss, A. & Glover, G. Artifact repair for fmri data from high motion clinical subjects. Human Brain Mapping 47, 70238–1 (2007).
This work was supported by the European Research Council (Advanced iConnect Project Grant ADV 320708) and the Netherlands Organisation for Scientific Research (Language in Interaction Project Gravitation Grant 024.001.006). We thank Frans Leijten, Cyrille Ferrier, Geert-Jan Huiskamp, Sandra van der Salm and Tineke Gebbink for help with collecting data; Peter Gosselaar and Peter van Rijen for implanting the electrodes; the technicians and staff of the clinical neurophysiology department and the patients for their time and effort; Jan Linnebank for editing the short film and the members of the UMC Utrecht ECoG research team for data collection. We also thank the Swedish Film Institute film company for their help and the provided materials.
The authors declare no competing interests.
Cite this article
Berezutskaya, J., Vansteensel, M.J., Aarnoutse, E.J. et al. Open multimodal iEEG-fMRI dataset from naturalistic stimulation with a short audiovisual film. Sci Data 9, 91 (2022). https://doi.org/10.1038/s41597-022-01173-0