A studyforrest extension, MEG recordings while watching the audio-visual movie “Forrest Gump”

Naturalistic stimuli, such as movies, are being increasingly used to map brain function because of their high ecological validity. The pioneering studyforrest and other naturalistic neuroimaging projects have provided free access to multiple movie-watching functional magnetic resonance imaging (fMRI) datasets to prompt the community for naturalistic experimental paradigms. However, sluggish blood-oxygenation-level-dependent fMRI signals are incapable of resolving neuronal activity with the temporal resolution at which it unfolds. Instead, magnetoencephalography (MEG) measures changes in the magnetic field produced by neuronal activity and is able to capture rich dynamics of the brain at the millisecond level while watching naturalistic movies. Herein, we present the first public prolonged MEG dataset collected from 11 participants while watching the 2 h long audio-visual movie “Forrest Gump”. Minimally preprocessed data was also provided to facilitate the use of the dataset. As a studyforrest extension, we envision that this dataset, together with fMRI data from the studyforrest project, will serve as a foundation for exploring the neural dynamics of various cognitive functions in real-world contexts.


Background & Summary
The mechanisms of human brain function in complex dynamic environments is the ultimate mystery that cognitive neuroscience aspires to quest. Most of the existing models on brain function have been obtained from tightly controlled experimental manipulations on carefully designed "artificial" stimuli. However, it remains unclear whether responses to these artificial stimuli can be generalized to ecological scenarios encountered in real-world environments, in terms of quantity, complexity, modality, and dynamics. To address these issues, naturalistic stimuli that encode a wealth of real-life contents have become increasingly popular for understanding brain function in ecological contexts. Researchers have achieved significant advances in the areas of human memory, attention, language, emotions, and social cognition using naturalistic stimuli (for recent reviews, please refer to Sonkusare et al. 1 and Jääskeläinen et al. 2 ). Simultaneously, emerging deep learning technologies that could afford multiple levels of representations for naturalistic stimuli are continuously expanding the application of naturalistic stimuli for exploring human brain function [3][4][5][6] .
Notably, owing to their dynamics and multimodal contents, movies have been successfully utilized as naturalistic stimuli to examine the mechanism by which the brain processes diverse psychological constructs and dynamic interactions. Functional magnetic resonance imaging (fMRI) is commonly employed to measure brain activity while watching a movie. In particular, the pioneering studyforrest and other naturalistic neuroimaging projects have released multiple fMRI datasets, collected from participants who had watched movie clips [7][8][9][10][11] . However, fMRI measures the relatively sluggish blood-oxygenation-level-dependent signal, therefore falling short of characterizing the complex neural dynamics underlying the cognitive processing of dynamic movies. In contrast, magnetoencephalography (MEG) measures the magnetic fields generated by neuronal activity on a millisecond time scale. Thus, MEG has great potential to pry open neural dynamics in processing naturalistic stimuli. Several studies have leveraged MEG to investigate brain activity for naturalistic movie stimuli in a short period (≤20 min) [12][13][14][15][16] . However, there is still a dearth of publicly accessible MEG recordings for naturalistic stimuli, especially prolonged MEG recordings for dynamic movies that are more likely to capture the temporal dynamics of regular functional brain states that occur in everyday life, and contribute to unraveling human brain function in ecological contexts.
Herein, we present an MEG dataset obtained while watching the 2 h long audio-visual movie "Forrest Gump" (R. Zemeckis, Paramount Pictures, 1994). The recordings measure brain activation with a temporal resolution at the millisecond level, thus providing a timely and efficient extension to the studyforrest dataset. Specifically, MEG data were collected from 11 participants while they were watching the Chinese-dubbed movie "Forrest Gump" in eight consecutive runs, each lasting for roughly 15 min. High-resolution structural MRI was additionally acquired for all participants, allowing the incorporation of the detailed anatomy of the brain and head in the source localization of MEG signals. Together with the raw data, preprocessed MEG and MRI data with standard pipelines were also provided to facilitate the use of the data. Considering MEG and fMRI are complementary to each other, synergy between our present MEG recordings and fMRI data from the studyforrest project will provide a valuable resource to study brain function in the real-life contexts. We believe the dataset is suitable for addressing many questions pertaining to the neural dynamics of various aspects, including perception, memory, language, and social cognition.

participants.
A total of 11 participants (mean ± SD age: 22 ± 1.7 years, 6 female participants) from the Beijing Normal University, Beijing, China, volunteered for this study. They completed both the MEG and MRI sessions. All participants were right-handed, native Chinese speakers, with normal hearing and normal or corrected-to-normal vision. None of them had ever watched the film "Forrest Gump" before, except one who had watched some clips, however not the entire movie. Of the 10 participants, four had heard about the movie plot, while others did not. The study was approved by the Institutional Review Board of the Faculty of Psychology, Beijing Normal University. Written informed consent was obtained from all participants, prior to their participation. All participants provided additional consent for sharing their anonymized data for research purposes.
procedures. Figure 1 depicts the overall flow of data collection and preprocessing. Prior to data acquisition, all participants completed a questionnaire on their demographic information and familiarity with the movie "Forrest Gump". The data acquisition consisted of two sessions for each participant, namely, one MEG session to record their neural activities during movie watching and an MRI session with a T1-weighted (T1w) scan to measure the brain structure for the spatial localization of the MEG signal. The MRI scan immediately followed the MEG session for all participants, except for sub-07 and sub-11, who finished their MRI session a week later.
Stimulus material and presentation. The audio-visual stimuli were generated from the Chinese-dubbed "Forrest Gump" DVD, released in 2013 (ISBN: 978-7-7991-3934-0). The movie was split into eight segments, each of which lasted for approximately 15 min. The stimuli were initially obtained by concatenating all original VOB files from the DVD release into one MPEG-4 file, using FFmpeg (https://ffmpeg.org). The concatenated MPEG-4 file contained a video stream and a Chinese-dubbed audio stream, which was down-mixed from multi-channel Schematic of the data collection and preprocessing procedure.Data collection comprised of one MEG session followed by one MRI session. The neuromagnetic signals were recorded with a whole-scalp-covering MEG while the participants watched the audio-visual movie "Forrest Gump". An anatomical T1w imaging was acquired in the MRI session. The raw MEG data and MRI data were preprocessed with MNE and fMRIPrep toolbox, respectively. The MEG-MRI coregistration was performed on the preprocessed data.
www.nature.com/scientificdata www.nature.com/scientificdata/ to 2Channel stereo. The stimuli were then divided into eight segments using Adobe Premier software (Adobe Premiere Pro CC 2017, Adobe, Inc., San Jose, CA, USA). Each segment conformed to the following specifications: video codec = avc1, display aspect ratio = 4:3, resolution = 1024 × 768 pixels, frame rate = 25 FPS, color space = YUV, video bit depth = 8 bits, audio codec = mp4a-40-2, audio sampling rate = 48.0 kHz, and audio channels = 2. Each successive segment began with a 4 s repetition of the end of the previous segment. It should be noted that the Chinese-dubbed "Forrest Gump" is an abridged edition of the German version; some segments in the German version are not included in the Chinese version. To align with the stimuli of the studyforrest dataset as much as possible, a short clip from the German-dubbed DVD released in 2011 (EAN: 4010884250916) was added to our stimuli. Table 1 summarizes the sources of the MEG stimuli from both the Chinese and German versions. Figure 2 characterizes the alignment between the MEG stimuli (current study) and fMRI stimuli (studyforrest project) for each run.
The detailed configuration of the MEG system is illustrated in Fig. 3. The movie stimuli were presented using Psychophysics Toolbox Version 3 17 in MATLAB 2016 (MathWorks, Natick, MA, USA), which drives the PC hardware directly to generated stimuli. The audio stimuli were generated with a standard PCI sound card (Sound Blaster Audigy 5/Rx) and then amplified with an insert earphone. The insert earphone comprised of a small stimulator (E-A-RTONE 3 A; 3 M, Minnesota, USA), a thin plastic tube (Tygon B-44-4x tubing, length: 1.88 m, I.D.:1/8, O.D.: 3/16; Saint Gobain Tygon division), and an earplug (ER3-14A; Etymotic Research, Inc., Elk Grove Village, IL, United States). The visual stimuli were generated with a standard PCI graphics card (GeForce GT620; NVIDIA, Santa Clara, California, USA) and projected onto a screen in full-screen mode via a DLP projector (NP63 + ; NEC, Tokyo, Japan) with 1024 × 768-pixel resolution. The participants watched the visual stimuli on a rear projection screen through mirror reflection (visual field angles = 31.17° × 23.69°; viewing distance = 751 mm). The participants were instructed to watch the movie, without performing any other tasks and to keep still as best as possible.
The average latencies between the stimulus initiation and the actual receipt of the stimulus was 33 ms and 15 ms for the visual and audio stimuli, respectively. The latencies were measured independently from the MEG movie experiment. A standard pulse stimulus (visual: white patch, audio: pure tone) was requested to be presented by Psychophysics Toolbox, while the corresponding onset trigger was sent to the trigger channel. A photo (visual) / acoustic (audio) sensor connected to one of the MEG Analog-to-Digital Converter (ADC) channels, placed near the MEG helmet, was used to detect the stimuli. The latencies were then calculated as the difference between the recorded time of the ADC channel and the trigger channel. The latencies were measured multiple times and the average latencies were calculated for visual and audio stimulation respectively. MEG data acquisition. MEG data were recorded using a 275-channel whole-head axial gradiometer DSQ-3500 MEG system (CTF MEG, Canada) at the Institute of Biophysics, Chinese Academy of Sciences, Beijing, China. Three channels (i.e., MLF55, MRT23 and MRT16) were out of service due to failure of sensors. The neuromagnetic signals were recorded in continuous mode at a sampling rate of 600 Hz, without online digital band filters. A third-order synthetic gradiometer was employed to remove far-field noise. The precise timing of each frame was recorded. After each frame was requested to be presented to the participants, we recorded a trigger pulse lasting for five samples in the stimulus channel UPPT001. The beginning of the movie was indicated with www.nature.com/scientificdata www.nature.com/scientificdata/ a value of 255. Owing to the limited bit-width of the stimulus channel, the frame number could not be marked with an accurate value >20,000. The frame numbers were therefore marked as the ceiling of the timestamp of that frame divided by 10, resulting in step-like increasing marker values starting from 1 with a step width of 250 (25 fps * 10 s).
At the beginning of each session, three HPI coils were attached to the participants' nasion (NAS), left preauricular (LPA), and right preauricular (RPA) points to continuously measure their head position in the MEG helmet. A customized wooden chin-rest supporter was introduced to prevent possible head movements. The MEG session consisted of eight runs, with each run playing one movie segment. Eight segments were played chronologically. The participants took a self-paced break between runs. Following the completion of the MEG scan, the participants underwent an anatomical T1w scan. The HPI coils were replaced with three customized MRI-compatible vitamin E capsules in the MRI scan to provide spatial reference for the spatial alignment between the MEG and MRI data.  www.nature.com/scientificdata www.nature.com/scientificdata/ MEG data preprocessing. MEG data processing was performed offline using the MNE-Python v0.22 package 18 . The MEG preprocessing pipeline was conducted at the run level (Fig. 1). First, the bad channels were detected and marked. As a result, no bad channels were identified in all acquisitions except two in run-05 of sub-05. Second, a high-pass filter of 1 Hz was applied to remove possible slow drifts from the continuous MEG data. Finally, artifact removal was performed using an independent component analysis (ICA) with FastICA wrapped in the MNE. Artifact-ICs were classified mainly using the spatial topography and time course information following the guide provided by Bishop & Busch 19 . The number of independent components (IC) was set to 20. Two raters (i.e., X.L. and Y.D.) manually identified the head movement, eye movement, eye blinks, and cardiac artifacts (Fig. 4b). On average, 3.21 ICs (SD: 0.85) were classified as artifacts. The denoised MEG data were eventually reconstructed from all the non-artifact components and residual components (Fig. 4c). Both the raw and preprocessed data were provided in the released dataset.
The raw DICOM files of T1w images were converted to NIFTI files using dcm2niix (https://github.com/rordenlab/dcm2niix). The T1w images were then minimally preprocessed using the anatomical preprocessing pipeline from fMRIPrep v20.2.1, with default settings 20 . In brief, the T1w data were skull-stripped and corrected for intensity nonuniformity with ANTs and N4ITK 21 . Brain surfaces were reconstructed using FreeSurfer 22 . Spatial normalization to both MNI152NLin6Asym and MNI152NLin2009cAsym was performed through nonlinear registration with ANTs, using the brain-extracted versions of both T1w volume and template.

MEG-MRI coregistration procedure.
To reconstruct the source of MEG sensor signals, MEG data were co-registered with the high-resolution anatomical T1w MRI data for each participant. The NAS, LPA, and RPA points marked in both MEG and MRI sessions were used as fiducial points for the alignment of the MEG and MRI data. Specifically, following the generation of a high-resolution head surface using MNE make_scalp_surfaces www.nature.com/scientificdata www.nature.com/scientificdata/ based on FreeSurfer reconstruction, we performed MEG-MRI coregistration for each participant in the MNE COREG GUI 18 . First, the three fiducial points were manually pinned on the MRI-reconstructed head surface. An iterative algorithm (nearest-neighbor calculations) was then run to align the MEG and MRI coordinates. The co-registration was refined by manual adjustment. The results showed that the average distances between the three fiducials in the coregistered MEG and MRI coordinate systems were 0.96 mm, 4.22 mm, and 4.90 mm for NAS, LPA, and RPA, respectively. Both the MRI-fiducials files and MEG-MRI coordinate transformation files were included in the released data.
Raw data. The raw data of each participant were stored separately in the "sub-<participant_id>" folders ( Fig. 5b), consisting of two subfolders, namely "anat" and "meg". The T1w MRI data ("*T1w.nii.gz") and associated sidecar json files were located in the "anat" folders. The raw MEG data were provided as CTF ds files ("*_meg.ds") for each run, and located in the "meg" folder along with sidecar json files. In addition, "*_channel. tsv" files with MEG channel information, "*_events.tsv" files with the presentation timing of stimuli frames, and "*coordsystem.json" files with coordinate system information of the MEG sensors were included in the "meg" folder.

technical Validation
We assessed the data quality of both the raw and preprocessed data using four measures as follows: head motion magnitude, stimuli-induced time-frequency characteristics, homotopic functional connectivity (FC), and inter-subject correlation (ISC).

Motion magnitude distribution.
Head movements during MEG scans are one of the significant factors that degrade both sensor-and source-level analyses. Herein, we calculated the motion magnitude for each sample as the Euclidian distance between the current and the initial head position while the movie segment began playing. The head motion across all runs and all participants were summarized for each of the three fiducials (NAS, LPA, and RPA) to provide an overview of the head movement of the dataset. As shown in Fig. 6, motion magnitude of 95% of the samples had head motions lower than 3.43 mm, 4.11 mm, and 3.87 mm for NAS, LPA, and RPA, respectively. Furthermore, 50% of the samples had head motions smaller than 0.99 mm, 1.10 mm, and 1.46 mm for NAS, LPA, and RPA, respectively. These findings indicated low head motion magnitude on average. The head motion magnitudes of each participant are presented in Supplementary Fig. 1. time-frequency characterization of brain activity. Next, we validated whether MEG recordings could successfully detect the change in stimuli-induced brain activity. Because the movie stimuli do not have explicit condition structures as in conventional design, we selected two exemplar movie clips, within which the audio or visual features showed pronounced changes to examine if the expected change in MEG signals could be detected at the related sensors. In one clip (Seg 3: frame 15864 ± 125 [10:35 ± 5 sec], the scene where Gump was marching during the Vietnam War), the audio features changed significantly (a vocal voice developed from background music), whereas the visual features were stable. In contrast, the other clip (Seg 1: frame 21768 ± 125 [14:30 ± 5 sec], the scene where young Gump and Jenny were sitting on the tree waiting for stars), comprised of stable audio features, whereas the visual features changed from landscape to human figures. Validation was performed according to the following procedure 26-28 : first, time-frequency analysis with Morlet wavelets was conducted for each sensor in the occipital and temporal lobes. The baseline was set to 1000 ms before the change points of the audio or visual features, and the baseline mean was subtracted for each channel. Second, the time-frequency representations were averaged across the participants. Finally, the time-frequency representations were averaged across the sensors from the occipital and temporal lobes. As shown in Fig. 7, the time-frequency representations from the occipital sensors, and not the temporal sensors, were locked with changes in visual features. The opposite pattern was observed for the audio feature changes in the stimuli. The results demonstrated that current MEG data could accurately detect stimulus-induced brain activity.
Homotopic functional connectivity. A basic principle of the brain's functional architecture is that FC between inter-hemispheric homologs (i.e., homotopic regions) is particularly stronger than other interhemispheric (i.e., heterotopic) FCs 29,30 . Herein, we tested if MEG data for dynamic movies at the sensor level could reveal strong homotopic FC. First, the absolute envelope amplitude of the MEG signal for each sensor (i.e., channel) was calculated using the Hilbert transform and then down-sampled to 1 Hz. Second, the homotopic FC between a certain sensor and its homotopic sensor was calculated as the Pearson correlation coefficient. For comparison, the heterotopic FC was also calculated for each sensor as the average correlation between it and all www.nature.com/scientificdata www.nature.com/scientificdata/ its heterotopic sensors. Finally, the homotopic and heterotopic FC values were averaged across all runs and all participants. The homotopic FCs were generally stronger than the average heterotopic FCs (Fig. 8a). Importantly, the high homotopic FC primarily appeared at the sensors located in the occipital and temporal cortices, thereby indicating strong couplings driven by the movie stimuli (Fig. 8b).
Inter-subject correlation. ISC analysis uses the brain responses of a subject to naturalistic stimuli as a model to predict the brain responses of other subjects 31 . Numerous studies have demonstrated that visual and auditory cortices display significant ISC while watching audio-visual movies. We validated if a high ISC could be detected in our MEG data. For simplicity, the ISC analysis was conducted at the sensor-level. The MEG recordings captured neural oscillations at different frequency bands 32 . Therefore, the ISC was calculated in five bands (delta: 1-4 Hz, theta: 4-8 Hz, alpha: 8-13 Hz, beta: 13-30 Hz, and gamma: 30-100 Hz). First, the MEG signal was filtered for each band. Second, the absolute envelope amplitude of each band was calculated via the Hilbert transform and down-sampled to 1 Hz. Third, for each frequency band, a leave-one-participant-out ISC was calculated for the left participant as the temporal correlation between the envelope amplitude from the participant and the average of other participants. Finally, the mean ISC was calculated by averaging the ISC across all participants. As   Fig. 7 Time-frequency characterization of the brain activity for stimuli. MEG signals from two movie clips with pronounced changes in either audio or visual features of the stimuli have been examined. The time-frequency (TF) spectrum is shown for each condition with a unit of log ratio between the TF spectrum calculated from the signal of interest and that from the baseline signal (1000 ms before the onset). The occipital sensors (top) display significant signal changes with a change in the visual features of the stimuli. The temporal sensors (bottom) display significant changes with a change in the audio features. www.nature.com/scientificdata www.nature.com/scientificdata/ shown in Fig. 9, sensors with higher ISC were located near the visual and audio cortices in the preprocessed data, which reportedly displayed high ISC during movie watching in previous studies 14,[33][34][35] . In particular, the high ISC predominantly occurred in the delta, theta, and alpha bands, consistent with previous studies 14,34,35 . Together, the dataset demonstrated good validity in detecting ISC.
In addition, the high ISC around the orbital area in the raw data implies a synchronized eye movement across participants, which may have been generated by the participants' high engagement with the movie 36 . These similar eye movement patterns produced similar electrooculography artifacts, which in turn caused high ISC around the orbital area in the raw data. The fact that the high ISC around the orbital area is only observed in the raw data but not in the ICA denoised data testified this speculation. Therefore, these results also reflect a good validity of our preprocessing protocol.

Usage Notes
We presented the first public MEG dataset for a full-length movie. The MEG signals were recorded while the participants watched the 2 h long Chinese-dubbed audio-visual movie "Forrest Gump". The dataset provided a versatile resource for studying information processing in real-life contexts. First, MEG data could be independently used to study the neural dynamics of sensory processing and higher-level cognitive functions under real-life conditions. Second, as a studyforrest extension, the dataset could be integrated with publicly available fMRI data from the studyforrest project. The fusion of fMRI and MEG may shed new light on the relationship between spatially localized networks observed in fMRI and the MEG-derived temporal dynamics. Moreover, our massive MEG recordings enable the direct training of deep neural networks (DNNs) with neural activity patterns. In contrast to the DNNs that were usually trained with stimuli without referring to any neural representation, this kind of brain-constrained DNNs would act more like the human brain and generalize well across many tasks 37,38 .
Moreover, our dataset is also compatible with multiple stimulus annotations provided by the existing studyforrest project [39][40][41] . Because the visual stimuli for the MEG and fMRI datasets can be precisely aligned via the technique demonstrated in Fig. 2, the various non-speech related visual stimulus annotations from the German-dubbed fMRI stimuli in the studyforrest project can be used to complement the MEG data to explore the spatiotemporal dynamics underlying cognitive processing of annotated features. Besides, although the MEG and fMRI auditory stimuli cannot be strictly aligned at the phoneme level due to the language differences, annotations describing semantic features (e.g. semantic conflict) for fMRI stimuli should also work for the MEG dataset.
Despite the importance of the aforementioned dataset as an extension of the studyforrest dataset in studying the spatiotemporal dynamics underlying cognitive processing in real-life contexts, the limitations should be acknowledged. First, the participants in our MEG data did not overlap with that in the studyforrest project. Therefore, the fMRI-MEG fusion can be only performed at the group level (i.e., across participants) instead of at the individual level (i.e., within participants), thereby hampering the ability to study the individual differences in the coupling between spatially localized networks and temporal dynamics. Second, the dubbed languages used in our dataset and the studyforrest project were radically different, limiting the application of the data in examining spatiotemporal dynamics of brain activity underlying auditory processing and language. In addition, time differences between the stimuli in the MEG and fMRI data should be treated with caution. Considering the lower sensitivity of fMRI signals to the exact timing than that of MEG signals, we recommend the use of MEG stimuli for fusing fMRI and MEG data.  9 Topographic maps of ISC in different frequency bands derived from both the raw and preprocessed MEG data. A high ISC occurs near the visual and audio cortices in the delta, theta and alpha bands in the preprocessed data. The high ISC around the orbital area is only observed in the raw data but not in the ICA denoised data.