The studyforrest (http://studyforrest.org) dataset is likely the largest neuroimaging dataset on natural language and story processing publicly available today. In this article, along with a companion publication, we present an update of this dataset that extends its scope to vision and multi-sensory research. Fifteen participants of the original cohort volunteered for a series of additional studies: a clinical examination of visual function, a standard retinotopic mapping procedure, and a localization of higher visual areas, such as the fusiform face area. The combination of this update, the previous data releases for the dataset, and the companion publication, which includes neuroimaging and eye tracking data from natural stimulation with a motion picture, forms an extremely versatile and comprehensive resource for brain imaging research, with almost six hours of functional neuroimaging data across five different stimulation paradigms for each participant. Furthermore, we describe the employed paradigms and present results that document the quality of the data for the purpose of characterising major properties of participants’ visual processing stream.
Machine-accessible metadata file describing the reported data (ISA-tab format)
Background & Summary
The studyforrest dataset 1 , with its combination of functional magnetic resonance imaging (fMRI) data from prolonged natural auditory stimulation and a diverse set of structural brain scans, represents a versatile resource for brain imaging research with a focus on information processing under real-life like conditions. The dataset has, so far, been used to study the role of the insula in dynamic emotional experiences 2 , to model shared blood oxygenation level dependent (BOLD) response patterns across brains 3 , and to decode input audio power-spectrum profiles from fMRI 4 . The dataset has subsequently been extended twice, first with additional fMRI data from stimulation with music from various genres 5 and secondly with a description of the movie stimulus structure with respect to portrayed emotions 6 . However, despite providing three hours of functional imaging data per participant, the experimental paradigms exclusively involved auditory stimulation. This is a substantial limitation regarding the aim to aid the study of real-life cognition, which normally involves multi-sensory input. With the further extension of the dataset presented here and in a companion publication 7 , we are now substantially expanding the scope of research topics that can be addressed with this resource into the domain of vision and multi-sensory research.
This extension is twofold. While the companion publication 7 describes an audio-visual movie dataset with simultaneously acquired fMRI, cardiac/respiratory traces, and eye gaze trajectories, the present article focuses on data records and exams related to a basic characterization of the functional architecture of the visual processing stream of all participants—namely retinotopic organization and the localization of particular higher-level visual areas. The intended purpose of these data is to perform brain area segmentation or annotation using common paradigms and procedures in order to study the functional properties of areas derived from these standard definitions in situations of real-life like complexity. Moreover, knowledge about the specific spatial organization of visual areas in individual brains aids studies of the functional coupling between areas, and it also facilitates the formulation and evaluation of network models of visual information processing in the context of the studyforrest dataset.
The contributions of this study comprise three components: 1) results of a clinical eye examination for subjective measurements of visual function for all participants to document potential impairments of the visual system that may impact brain function, even beyond the particular properties relevant to the employed experimental paradigms; 2) raw data for a standard retinotopic mapping paradigm and a six-category block-design localizer paradigm for higher visual areas, such as the fusiform face area (FFA) 8 , the parahippocampal place area (PPA) 9 , the occipital face area (OFA) 10 , the extrastriate body area (EBA) 11 , and the lateral occipital complex (LOC) 12 ; and 3) validation analyses providing volumetric angle maps for retinotopy data and ROI masks for visual areas. While the first two components are raw empirical data, the third component is based on a largely arbitrary selection of analysis tools and procedures. No claim is made that the chosen methods are superior to any alternative, but the results are shared to document the plausibility of the results and to facilitate follow-up studies that do not require any particular method to analyze and interpret these data.
Fifteen right-handed participants (mean age 29.4 years, range 21–39, 6 females) volunteered for this study. All of them had participated in both previous studies of the studyforrest project 1,5 . The native language of all participants was German. The integrity of their visual function was assessed at the Visual Processing Laboratory, Ophthalmic Department, Otto-von-Guericke University, Magdeburg, Germany as specified below. Participants were fully instructed about the purpose of the study and received monetary compensation. They signed an informed consent for public sharing of all obtained data in anonymized form. This study was approved by the Ethics Committee of the Otto-von-Guericke University (approval refs 13,37).
Subjective measurements of visual function
To test whether the study participants had normal visual function and to detect critical reductions of visual function, two important measures were determined: (1) visual acuity to identify dysfunction of high resolution vision and (2) visual field sensitivity to localize visual field defects. For each participant, these measurements were performed for each eye separately, if necessary with refractive correction. (1) Normal decimal visual acuity (≥1.0) was obtained for each eye of each participant. (2) Visual field sensitivities were determined with static threshold perimetry (standard static white-on-white perimetry, program: dG2, dynamic strategy; OCTOPUS Perimeter 101, Haag-Streit, Koeniz, Switzerland) at 59 visual field locations in the central visual field (30° radius), i.e., covering the part of the visual field that was stimulated during the MRI scans. In all but two participants, visual field sensitivities were normal for each eye (mean defect (MD) between −2.0 and 2.0 dB; loss variance (LV) below 6 dB²), indicating the absence of visual field defects. Visual field sensitivities for participant 2 (right eye) and participant 4 (both eyes) were slightly lower than normal but not indicative of a distinct visual field defect.
Functional MRI acquisition setup
For all of the fMRI acquisitions described in this paper, the following parameters were used: T2*-weighted echo-planar images (gradient-echo, 2 s repetition time (TR), 30 ms echo time, 90° flip angle, 1943 Hz/px bandwidth, parallel acquisition with sensitivity encoding (SENSE) reduction factor 2) were acquired during stimulation using a whole-body 3 Tesla Philips Achieva dStream MRI scanner equipped with a 32 channel head coil. Thirty-five axial slices (thickness 3.0 mm, 80×80 voxels with 3.0×3.0 mm in-plane resolution, 240 mm field-of-view (FoV), anterior-to-posterior phase encoding direction) with a 10% inter-slice gap were recorded in ascending order, practically covering the whole brain. Philips’ ‘SmartExam’ was used to automatically position slices in AC-PC orientation such that the topmost slice was located at the superior edge of the brain. This automatic slice positioning procedure was identical to the one used for scans reported in the companion article 7 and yielded a congruent geometry across all paradigms. Comprehensive metadata on acquisition parameters are available in the data release.
Pulse oximetry and recording of the respiratory trace were performed simultaneously with all fMRI data acquisitions using the built-in equipment of the MR scanner. Although the measurement setup yielded time series with an apparent sampling rate of 500 Hz, the effective sampling rate was limited to 100 Hz.
Visual stimuli were presented on a rear-projection screen inside the bore of the magnet using an LCD projector (JVC DLA RS66E, JVC Ltd., light transmission reduced to 13.7% with a gray filter) connected to the stimulus computer via a DVI extender system (Gefen EXT-DVI-142DLN with EXT-DVI-FM1000). The screen dimensions were 26.5 cm×21.2 cm at a resolution of 1280×1024 px with a 60 Hz video refresh rate. Binocular stimulation was presented to the participants through a front-reflective mirror mounted on top of the head coil at a viewing distance of 63 cm. Stimulation was implemented with PsychoPy v1.79 (with an early version of the MovieStim2 component later to be publicly released with PsychoPy v1.81) on the (Neuro)Debian operating system. All employed stimulus implementations are available in the data release (code/stimulus/). Participant responses were collected with a two-button keypad and were also logged on the stimulus computer.
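The screen geometry given above is sufficient to convert between stimulus size in pixels and degrees of visual angle. The following is a minimal sketch of that conversion, assuming a flat screen viewed centrally at the stated distance; the function names are hypothetical and are not taken from the released stimulus code.

```python
import math

# Values taken from the setup description above.
SCREEN_WIDTH_CM = 26.5
SCREEN_WIDTH_PX = 1280
VIEWING_DISTANCE_CM = 63.0


def size_to_degrees(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) of an extent of size_cm centered at fixation."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def degrees_to_pixels(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Extent in horizontal pixels that subtends angle_deg at fixation."""
    size_cm = 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)
    return size_cm * SCREEN_WIDTH_PX / SCREEN_WIDTH_CM


screen_width_deg = size_to_degrees(SCREEN_WIDTH_CM)  # roughly 23.8 degrees
ten_deg_px = degrees_to_pixels(10.0)                 # roughly 532 pixels
```

Under these assumptions the full screen width spans a little under 24° of visual angle, and a 10° stimulus covers a little over 530 pixels.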
Similar to previous studies 15,16 , traveling wave stimuli were designed to encode visual field representations in the brain using temporal activation patterns 17 . This paradigm was selected to allow for analyses with conventional techniques, as well as population receptive field mapping 18 .
Expanding/contracting rings and clockwise/counter-clockwise wedges (see Fig. 1a) consisting of flickering radial checkerboards (flickering frequency of 5 Hz) were displayed on a gray background (mean luminance ≈100 cd m–2) to map eccentricity and polar angle. The total run time for both eccentricity and polar angle stimuli was 180 s, comprising five seamless stimulus cycles of 32 s duration each along with 4 and 12 s of task-only periods (no checkerboard stimuli) respectively at the start and the end.
The flickering checkerboard stimuli had adjacent patches of pseudo-randomly chosen colors, with pairwise Euclidean distances in the Lab color space (quantifying relative perceptual differences between any two colors) of at least 40. Each of these colored patches was plaided with a set of radially moving points. To improve the perceived contrast, the points were either black or white depending on the color of the patch on which they were located. The lifetime of these points was set to 0.4 s, after which a new point was initialised at a random location. With every flicker, the color of each patch changed to its complementary luminance and, simultaneously, the direction of movement of the plaided points reversed.
Eccentricity encoding was implemented by a concentric flickering ring expanding and contracting across the visual field (0.95° of visual angle in width). The ring was not scaled with cortical magnification factor. The concentric ring traveled across the visual field in 16 equal steps, stimulating every location in the visual field for 2 s. After each cycle, the expanding or the contracting rings were replaced by new rings at the center or the periphery respectively.
Polar angle encoding was implemented by a single moving wedge (clockwise and counter-clockwise direction). The opening angle of the wedge was 22.5 degrees. Similar to the eccentricity stimuli, every location in the visual field was stimulated for 2 s before the wedge was moved to the next position. Videos for all four stimuli are provided in the data release (
Center letter reading task
In order to keep the participants’ attention focused and to minimize eye movements, they performed a reading task. A black circle (radius 0.4°) was presented as a fixation point at the center of the screen, superimposed on the main stimulus. Within this circle, a randomly selected excerpt of song lyrics was shown as a stream of single letters (0.5° height, letter frequency 1.5 Hz, 85% duty cycle) throughout the entire length of a run. Participants had to fixate, as they were unable to perform the reading task otherwise. After each acquisition run, participants were presented with a question related to the previously read text. They were given two probable answers, to which they replied by a corresponding button press (index or middle finger of their right hand). These questions only served the purpose of keeping participants attentive and were otherwise irrelevant. The correctness of the responses was not evaluated.
Participants performed four acquisition runs in a single session with a total duration of 12 min, with short breaks in-between and without moving out of the scanner. In each run, participants performed the center reading task while passively watching the contracting, counter-clockwise rotating, expanding, and clockwise rotating stimuli in exactly this sequential order. For the retinotopic mapping experiment, 90 volumes of fMRI data were acquired for each run.
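The timing described for the retinotopic mapping runs (4 s task-only lead-in, five 32 s cycles, 16 stimulus positions of 2 s each, 2 s TR) can be turned into a per-volume lookup. The sketch below is illustrative only: it assumes that stimulation and acquisition of the first volume start at the same moment, and the function name is hypothetical.

```python
TR_S = 2.0          # repetition time
LEAD_IN_S = 4.0     # task-only period before the first cycle
CYCLE_S = 32.0      # duration of one stimulus cycle
N_CYCLES = 5
N_STEPS = 16        # ring or wedge positions per cycle
STEP_S = CYCLE_S / N_STEPS  # 2 s per position


def step_for_volume(volume_index):
    """Return (cycle, step) shown while a volume was acquired,
    or None if the volume falls into a task-only period."""
    t_stim = volume_index * TR_S - LEAD_IN_S
    if t_stim < 0 or t_stim >= N_CYCLES * CYCLE_S:
        return None
    cycle, within_cycle = divmod(t_stim, CYCLE_S)
    return int(cycle), int(within_cycle // STEP_S)


# One entry per acquired volume of a run (90 volumes).
schedule = [step_for_volume(v) for v in range(90)]
```

With these numbers, 80 of the 90 volumes coincide with checkerboard stimulation. For the wedge runs, a step index corresponds to a rotation of step × 22.5° relative to the starting position, since 16 × 22.5° = 360°.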
Localizer for higher visual areas
All the stimuli for this experiment were used in a previous study. There were 24 unique grayscale images from each of six stimulus categories: human faces, human bodies without heads, small objects, houses, outdoor scenes comprising nature and street scenes, and phase scrambled images (Fig. 2b). Mirrored views of these 24×6 images were also used as stimuli. The original images were converted to grayscale and scaled to a resolution of 400×400 px. Images were matched in luminance using lumMatch in the SHINE toolbox to a mean and standard deviation of 128 and 70, respectively. The original images of human faces and houses were produced in the Haxby lab at Dartmouth College; human body images were obtained from Paul Downing’s lab at Bangor University; images of small objects were obtained from the Bank of Standardized stimuli (BOSS); outdoor natural scenes are a collection of personal images and public domain resources; and street scenes are taken from the CBCL Street scene database (http://cbcl.mit.edu/software-datasets/streetscenes/). Stimulus images were displayed at a size of approximately 10°×10° of visual angle. Stimulus images (original and preprocessed) are provided in the data release (
Participants were presented with four block-design runs, with two 16 s blocks per stimulus category in each run, while they also performed a one-back matching task to keep them attentive. The order of blocks was randomized so that all six conditions appeared in random order in both the first and second halves of a run. However, due to a coding error, the block-order was identical across all four runs; though the actual per-block image sequence was a different random sequence for each run. The block configuration and implementation of the matching task are depicted in Fig. 2a. 156 fMRI volumes were acquired during each experiment run.
Movie frame localizer
A third stimulation paradigm was implemented to collect BOLD fMRI data for an independent localization of voxels that show a response to basic visual stimulation in areas of the visual field covered by the movie stimulus used in the companion article 7 . The stimulus was highly similar to the one used for the retinotopic mapping, but instead of isolated rings and wedges, the dynamic stimulus covered either the full rectangle of the movie frame (without the horizontal bars at the top and bottom) or just the horizontal bars. The stimulus alternated every 12 s, starting with the movie frame rectangle. Stimulus movies are provided in the data release (code/stimulus/movie_localizer/videos). A total of four stimulus alternation cycles were presented, starting in synchrony with the acquisition of the first fMRI volume. A total of 48 volumes were acquired. During stimulation, participants performed the same reading task as in the retinotopic mapping session. Hence, a localization of responsive voxels assumes central fixation and can only be considered an approximation of the responsive area of the visual cortex during the movie session, where eye movements were permitted.
All custom source code for data conversion from raw, vendor-specific formats into the de-identified released form is included in the data release (code/rawdata_conversion). fMRI data conversion from DICOM to NIfTI format was performed with heudiconv (https://github.com/nipy/heudiconv), and the de-identification of these images was implemented with
The data release also contains the implementations of the stimulation paradigms in code/stimulus/. Moreover, analysis code for visual area localization and retinotopic mapping is available in two dedicated repositories at https://github.com/psychoinformatics-de/studyforrest-data-visualrois and https://github.com/psychoinformatics-de/studyforrest-data-retinotopy, respectively.
This dataset is compliant with the Brain Imaging Data Structure (BIDS) specification 22 , which is a new standard to organize and describe neuroimaging and behavioral data in an intuitive and common manner. Extensive documentation of this standard is available at http://bids.neuroimaging.io. This section provides information about the released data, but limits its description to aspects that extend the BIDS specification. For a general description of the dataset layout and file naming conventions, the reader is referred to the BIDS documentation. In summary, all files related to the data acquisitions for a particular participant described in this manuscript can be located in a sub-<ID>/ses-localizer/ directory, where ID is the numeric subject code.
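As an illustration of this layout, the sketch below collects the BOLD files of one participant for a given task from a local copy of the dataset. It is an example under stated assumptions, not part of the data release: the helper name, the zero-padded subject code, the `.nii.gz` extension, the `func/` subdirectory and the `run-1` entities in the demonstration file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def localizer_bold_files(dataset_root, subject_id, task_prefix="retmap"):
    """List BOLD images of one participant whose task label starts with task_prefix."""
    session_dir = Path(dataset_root) / f"sub-{subject_id}" / "ses-localizer"
    pattern = f"*ses-localizer_task-{task_prefix}*_bold.nii.gz"
    # rglob searches all subdirectories, so no modality folder is hard-coded.
    return sorted(session_dir.rglob(pattern))


# Demonstration on a throw-away directory tree with made-up file names.
with tempfile.TemporaryDirectory() as root:
    func_dir = Path(root) / "sub-01" / "ses-localizer" / "func"
    func_dir.mkdir(parents=True)
    for task in ("retmapcon", "objectcategories", "movielocalizer"):
        (func_dir / f"sub-01_ses-localizer_task-{task}_run-1_bold.nii.gz").touch()
    found = [p.name for p in localizer_bold_files(root, "01")]
```

Changing `task_prefix` to `objectcategories` or `movielocalizer` selects the files of the other two paradigms in the same way.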
In order to de-identify data, information on center-specific study and subject codes has been removed using an automated procedure. All human participants were given sequential integer IDs. Furthermore, all BOLD images were ‘de-faced’ by applying a mask image that zeroed out all voxels in the vicinity of the facial surface, teeth, and auricles. For each image modality, this mask was aligned and re-sliced separately. The resulting tailored mask images are provided as part of the data release to indicate which parts of the image were modified by the de-facing procedure (de-face masks carry a _defacemask suffix to the base file name).
In addition to the acquired primary data described in this section, we provide the results of the validation analyses described below. These are: 1) manually titrated ROI masks for visual areas localized for all participants (https://github.com/psychoinformatics-de/studyforrest-data-visualrois); and 2) volumetric and surface-projected eccentricity and polar angle maps from the retinotopic mapping analysis (https://github.com/psychoinformatics-de/studyforrest-data-retinotopy).
Each image time series in NIfTI format is accompanied by a JSON sidecar file that contains a dump of the original DICOM metadata for the respective file. Additional standardized metadata is available in the task-specific JSON files defined by the BIDS standard.
fMRI data files for the retinotopic mapping contain *ses-localizer_task-retmap*_bold in their file name. Specifically, the task label in the file name (for example, retmapcon) indicates whether a stimulation run used clockwise or counterclockwise rotating wedges, or expanding or contracting rings.
Higher visual area localizers
fMRI data files for the visual area localizers contain *ses-localizer_task-objectcategories*_bold in their file name. The stimulation timing for each acquisition run is provided in corresponding *_events.tsv files. These three-column text files describe the duration of each stimulus block and identify the associated stimulus category (
Movie frame localizer
fMRI data files for the movie frame localizer contain *ses-localizer_task-movielocalizer*_bold in their file name.
Time series of pleth pulse and respiratory trace are provided for all BOLD fMRI scans in a compressed three-column text file: volume acquisition trigger, pleth pulse, and respiratory trace (file name scheme: _recording-cardresp_physio.tsv.gz). The scanner’s built-in recording equipment neither logs the volume acquisition trigger nor records a reliable marker of the acquisition start. Consequently, the trigger log has been reconstructed based on the temporal position of a scan’s end-marker, the number of volumes acquired, and the assumption of an exactly identical acquisition time for all volumes. The time series have been truncated to start with the first trigger and end after the last volume has been acquired.
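The reconstruction logic described above amounts to counting back from the end-marker in equal steps. The following sketch illustrates the idea only; it is not the conversion code from the data release. The function name is hypothetical, and the default sampling rate of 500 Hz is an assumption based on the apparent sampling rate reported for the recordings.

```python
def reconstruct_triggers(end_marker_sample, n_volumes,
                         tr_s=2.0, sampling_rate_hz=500.0):
    """Return the sample index at which each volume acquisition started,
    assuming every volume took exactly tr_s seconds."""
    samples_per_volume = tr_s * sampling_rate_hz
    first_trigger = end_marker_sample - n_volumes * samples_per_volume
    if first_trigger < 0:
        raise ValueError("recording is too short for the given number of volumes")
    return [int(round(first_trigger + i * samples_per_volume))
            for i in range(n_volumes)]


# Made-up example: a 90-volume run whose end-marker sits at sample 100000.
triggers = reconstruct_triggers(end_marker_sample=100000, n_volumes=90)
```

Truncating a recording to the range from the first trigger to the end-marker then reproduces the kind of trimming described for the released time series.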
All image analyses presented in this section were performed on the released data in order to test for negative effects of the de-identification procedure on subsequent analysis steps.
During data acquisition, (technical) problems were noted in a log. All known anomalies and their impact on the dataset are detailed in Table 1.
Temporal signal-to-noise ratio (tSNR)
Data acquisition was executed using the R5 software version of the scanner vendor. With this version, the vendor changed the frequency of the Spectral Presaturation by Inversion Recovery (SPIR) pulse from the previous 135 Hz to 220 Hz in order to increase fat suppression efficiency. After completion of data acquisition, it was discovered that the new configuration led to undesired interactions with pulsations in the cerebrospinal fluid in the ventricles, which resulted in a reduced temporal stability of the MR signal around the ventricles. Figure 3a illustrates the magnitude and spatial extent of this effect. Despite this issue, the majority of voxels show a tSNR (ratio of mean and standard deviation of the signal across time) of ≈70 or above (Fig. 3b), as can be expected with a voxel volume of about 27 mm³ and with 3 Tesla acquisition 24 . Source code for the tSNR calculation is available at https://github.com/psychoinformatics-de/studyforrest-data-aligned/tree/master/code.
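The tSNR definition used here (temporal mean divided by temporal standard deviation, per voxel) can be written in a few lines. The sketch below is a generic NumPy illustration on synthetic data and is not the code from the repository linked above; loading of NIfTI images is left out to keep the example self-contained.

```python
import numpy as np


def temporal_snr(bold_4d):
    """Voxel-wise tSNR of a 4D array whose last axis is time."""
    data = np.asarray(bold_4d, dtype=float)
    mean = data.mean(axis=-1)
    std = data.std(axis=-1)
    tsnr = np.zeros_like(mean)
    # Voxels without temporal variance (e.g. zeroed by de-facing) are set to 0.
    np.divide(mean, std, out=tsnr, where=std > 0)
    return tsnr


# Synthetic run: 90 volumes, signal mean 700, noise standard deviation 10.
rng = np.random.default_rng(0)
synthetic = 700.0 + 10.0 * rng.standard_normal((4, 4, 3, 90))
tsnr = temporal_snr(synthetic)
```

For the synthetic example the resulting values scatter around 70, the ratio of the chosen mean and noise level.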
Retinotopic mapping analysis
Many regions of interest (ROI) in the human visual system follow a retinotopic organization. The primary areas like V1 and V2 are also provided as labels by the Freesurfer segmentation. But the higher visual areas (V3, VO, PHC, etc.) need to be localized by retinotopic mapping.
We implemented a standard analysis pipeline for the acquired fMRI data based on standard algorithms publicly available in the software packages Freesurfer 26 , FSL 33 , and AFNI 34 . All analysis steps were performed on a computer running the (Neuro)Debian operating system 14 , and all necessary software packages (except for Freesurfer) were obtained from system software package repositories.
BOLD image time series for all scans of the retinotopic mapping paradigm were brain-extracted using FSL’s BET and aligned (rigid-body transformation) to a participant-specific BOLD template image. Computed transformations are available at https://github.com/psychoinformatics-de/studyforrest-data-templatetransforms. All volumetric analysis was performed in this image space. An additional rigid-body transformation was computed to align the BOLD template image to the previously published cortical surface reconstructions based on T1 and T2-weighted structural images of the respective participants for later delineation of visual areas on the cortical surface. Using AFNI tools, time series images were also ‘deobliqued’ (3dWarp), slice time corrected (3dTshift), and temporally bandpass-filtered (3dBandpass; cutoff frequencies set to 0.667/32 Hz and 2/32 Hz, where 32 s is the period of both the ring and the wedge stimulus).
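To make the chosen cutoffs concrete: 0.667/32 Hz ≈ 0.021 Hz and 2/32 Hz = 0.0625 Hz, which bracket the stimulus frequency of 1/32 Hz ≈ 0.031 Hz. The sketch below applies a plain FFT-based bandpass with these cutoffs to a synthetic time series. It is a NumPy stand-in for illustration and does not reproduce the exact behavior of AFNI's 3dBandpass.

```python
import numpy as np

TR_S = 2.0
PERIOD_S = 32.0               # stimulus cycle duration
LOW_HZ = 0.667 / PERIOD_S     # lower cutoff, about 0.021 Hz
HIGH_HZ = 2.0 / PERIOD_S      # upper cutoff, 0.0625 Hz


def fft_bandpass(ts, tr_s=TR_S, low_hz=LOW_HZ, high_hz=HIGH_HZ):
    """Zero all Fourier components outside [low_hz, high_hz] along the last axis."""
    ts = np.asarray(ts, dtype=float)
    n = ts.shape[-1]
    freqs = np.fft.rfftfreq(n, d=tr_s)
    spectrum = np.fft.rfft(ts, axis=-1)
    spectrum[..., (freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=n, axis=-1)


# Synthetic voxel: stimulus-locked response plus slow drift and fast noise.
t = np.arange(90) * TR_S
raw = np.sin(2 * np.pi * t / PERIOD_S) + 0.01 * t + 0.5 * np.sin(2 * np.pi * 0.2 * t)
filtered = fft_bandpass(raw)
```

The filter removes the mean, the slow drift outside the pass band and the fast 0.2 Hz component, while retaining fluctuations near the stimulus frequency.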
For angle map estimation, AFNI’s waver command was used to create an ideal response time series waveform based on the design of the stimulus. The bandpass filtered BOLD images were then processed with the DELAY phase estimation method, which was based on this response time series model. Expanding and contracting rings, as well as clockwise and counter-clockwise wedge stimuli, were jointly used to generate average volumetric phase maps representing eccentricity and polar angles for each participant. Polar angle maps were adjusted for a shift in the starting position of the wedge stimulus between the two rotation directions. The phase angle representations, relative to the visual field, are shown in Fig. 1a. As an overall indicator of mapping quality, Fig. 1b shows the distribution of the polar angle representations across all voxels in the MNI occipital lobe mask combined for all participants.
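The general idea behind delay-based phase estimation can be sketched as follows: the response of a voxel is compared against time-shifted copies of a reference waveform, and the best-matching shift is expressed as a phase angle of the stimulus cycle. The code below is a simplified illustration of that idea and not AFNI's implementation. It assumes a sinusoidal reference instead of a waver-generated waveform, a time series that covers whole stimulus cycles only, and a resolution of one sample (22.5°); all names are hypothetical.

```python
import numpy as np

SAMPLES_PER_CYCLE = 16   # 32 s stimulus period / 2 s TR
N_CYCLES = 5


def estimate_phase_deg(voxel_ts, reference_ts, samples_per_cycle=SAMPLES_PER_CYCLE):
    """Phase (degrees) of the circular lag at which the reference best matches the voxel."""
    voxel = np.asarray(voxel_ts, dtype=float)
    voxel = voxel - voxel.mean()
    reference = np.asarray(reference_ts, dtype=float)
    reference = reference - reference.mean()
    # Correlate the voxel with the reference shifted by 0..15 samples.
    scores = [np.dot(np.roll(reference, lag), voxel)
              for lag in range(samples_per_cycle)]
    best_lag = int(np.argmax(scores))
    return 360.0 * best_lag / samples_per_cycle


# Synthetic voxel that responds 5 samples (10 s) after the reference.
t = np.arange(N_CYCLES * SAMPLES_PER_CYCLE)
reference = np.sin(2 * np.pi * t / SAMPLES_PER_CYCLE)
rng = np.random.default_rng(0)
voxel = np.roll(reference, 5) + 0.1 * rng.standard_normal(t.size)
phase = estimate_phase_deg(voxel, reference)  # 5 / 16 of a cycle = 112.5 degrees
```

Estimates obtained this way include the hemodynamic delay, which is one reason why runs with opposite stimulus directions are combined when computing the final maps.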
For visualization and subsequent delineation, all volumetric angle maps (after correction) were projected onto the cortical surface mesh of the respective participant using Freesurfer’s mri_vol2surf command, separately for each hemisphere. In order to illustrate the quality of the angle maps, the subjectively best, average, and worst participants (respectively: participants 1, 10, and 9) have been selected on the basis of visual inspection. Figure 1c shows the eccentricity maps on the left panel and the polar angle maps for both hemispheres on the right panel. A table summarizing the results of the manual inspections of all surface maps is available at https://github.com/psychoinformatics-de/studyforrest-data-retinotopy/tree/master/qa. Delineations of the visual areas depicted in Fig. 1c were derived according to Kaule et al. (page 4). Further details on the procedure can be found in refs 28,
Localization of higher visual areas
To localize higher visual areas for each participant, we implemented a standard two-level general linear model (GLM) analysis using the FEAT component in FSL. BOLD image time series were slice-time-corrected, masked with a conservative brain mask, spatially smoothed (Gaussian kernel, 4 mm FWHM), and temporally high-pass filtered using a cutoff period of 100 s. For each acquisition run, we defined the stimulation design using six boxcar functions, one for each condition (bodies, faces, houses, small objects, landscapes, scrambled images), such that each stimulation block was represented as a single contiguous 16 s segment. The GLM design comprised these six regressors, each convolved with FSL’s ‘Double-Gamma HRF’ as a model hemodynamic response function. Temporal derivatives of those regressors were also included in the design matrix, and it was subjected to the same temporal filtering as the BOLD time series.
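The construction of such block regressors can be illustrated with a short NumPy sketch. It builds one 16 s boxcar per block and convolves it with a generic double-gamma response function. Several parts are assumptions made for illustration: the HRF parameters (gamma shapes 6 and 16, undershoot ratio 1/6) are a common textbook choice and not necessarily FSL's defaults, the block onsets are made up (the real ones are listed in the *_events.tsv files), and temporal derivatives and filtering are omitted.

```python
import math
import numpy as np

TR_S = 2.0
N_VOLUMES = 156
BLOCK_S = 16.0
CONDITIONS = ["bodies", "faces", "houses", "small objects", "landscapes", "scrambled"]


def gamma_pdf(t, shape):
    """Gamma probability density with unit scale, zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] ** (shape - 1) * np.exp(-t[pos]) / math.gamma(shape)
    return out


def double_gamma_hrf(tr_s=TR_S, duration_s=32.0):
    """Generic double-gamma HRF sampled at the TR (illustrative parameters)."""
    t = np.arange(0.0, duration_s, tr_s)
    hrf = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0
    return hrf / hrf.sum()


def block_regressor(onsets_s, n_volumes=N_VOLUMES, tr_s=TR_S, block_s=BLOCK_S):
    """Boxcar for the given block onsets, convolved with the HRF."""
    boxcar = np.zeros(n_volumes)
    for onset in onsets_s:
        start = int(round(onset / tr_s))
        stop = int(round((onset + block_s) / tr_s))
        boxcar[start:stop] = 1.0
    return np.convolve(boxcar, double_gamma_hrf(tr_s))[:n_volumes]


# Hypothetical onsets: each condition once per run half, one block every 24 s.
onsets = {c: [12.0 + 24.0 * i, 12.0 + 24.0 * (i + 6)] for i, c in enumerate(CONDITIONS)}
design = np.column_stack([block_regressor(onsets[c]) for c in CONDITIONS])
```

Each column of `design` is one condition regressor with one row per fMRI volume.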
At the first level, we defined a series of t-contrasts to implement different localization procedures found in the literature 36 . The strict set included one contrast per target region of interest and involved all stimulus conditions (one condition versus all others, except for the PPA contrast, where houses/landscapes were contrasted with all other conditions). The relaxed set included contrasts structurally similar to those of the strict set, but with a reduced number of contrast conditions; for example, the FFA contrast was defined as faces versus small objects and scrambled images. Lastly, the simple set contained only contrasts of one (e.g., faces) or two related conditions (e.g., houses and landscapes) against responses to scrambled images.
The GLM analysis was performed for each experiment run individually, and afterwards results were aggregated in a within-subject second-level analysis by averaging. Statistical evaluation (fixed-effects analysis) and cluster-level thresholding were performed at the second level using a cluster forming threshold of Z>1.64 and a corrected cluster probability threshold of P<0.05.
We defined category-selective regions starting with the contrast clusters that survived second-level analysis for each participant. For each region of interest, we started with the most conservative contrast (strict set) by using a threshold of t=2.5 and looked for clusters with at least 20 voxels (using AFNI). We titrated the threshold in the range of [2, 3] until we found an isolated cluster for the localizer region of interest. If a cluster was not found or not isolated, we used a contrast from the relaxed set, or finally the simple set, and repeated the process until we found a cluster that matched the expected anatomical location based on literature for FFA/OFA 37 , PPA 9 , LOC 12 , and EBA 11 .
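The threshold titration described above was a manual procedure. As a rough illustration of its logic, the sketch below searches thresholds in the range [2, 3], starting at 2.5, until exactly one cluster of at least 20 voxels remains. This is an automated stand-in under several assumptions: "isolated" is interpreted as "exactly one surviving cluster", clusters are defined by face connectivity, the step size of 0.1 is arbitrary, and the anatomical plausibility check is not modeled. The function names are hypothetical and the clustering is plain Python rather than AFNI.

```python
from collections import deque
import numpy as np

NEIGHBOR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def clusters_above(stat_map, threshold, min_voxels=20):
    """Return face-connected clusters (lists of voxel indices) with values > threshold."""
    mask = np.asarray(stat_map) > threshold
    visited = np.zeros(mask.shape, dtype=bool)
    clusters = []
    for seed in zip(*np.nonzero(mask)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue, members = deque([seed]), []
        while queue:  # breadth-first flood fill from the seed voxel
            voxel = queue.popleft()
            members.append(voxel)
            for offset in NEIGHBOR_OFFSETS:
                nb = tuple(v + o for v, o in zip(voxel, offset))
                inside = all(0 <= n < s for n, s in zip(nb, mask.shape))
                if inside and mask[nb] and not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
        if len(members) >= min_voxels:
            clusters.append(members)
    return clusters


def titrate_threshold(stat_map, start=2.5, low=2.0, high=3.0, step=0.1, min_voxels=20):
    """Try thresholds closest to start first; return (threshold, cluster) or None."""
    candidates = np.round(np.arange(low, high + step / 2, step), 2)
    for threshold in sorted(candidates, key=lambda value: abs(value - start)):
        found = clusters_above(stat_map, threshold, min_voxels)
        if len(found) == 1:
            return float(threshold), found[0]
    return None


# Synthetic statistic map with two 27-voxel blobs of different strength.
stat_map = np.zeros((10, 10, 10))
stat_map[2:5, 2:5, 2:5] = 2.8
stat_map[6:9, 6:9, 6:9] = 2.3
result = titrate_threshold(stat_map)
```

In the synthetic example only the stronger blob survives the starting threshold, so the search stops immediately. If no threshold isolates a cluster, the function returns `None`, which corresponds to falling back to a contrast from the relaxed or simple set.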
Figure 4 depicts the results of this procedure for all regions of interest by means of localization overlap across all participants on the cortical surface of the MNI152 brain. Detailed participant-specific information is provided in Table 2 (face-responsive regions), Table 3 (scene and place responsive regions), and Table 4 (early visual areas and LOC). Both the spatial localization of regions in the group of participants and the frequency of localization success approximately match reports in the literature (for example 8 ).
The source code for this analysis and the area masks for all participants are available at https://github.com/psychoinformatics-de/studyforrest-data-visualrois.
The procedures we employed in this study resulted in a dataset that is highly suitable for automated processing. Data files are organized according to the BIDS standard 22 . Data are shared in documented standard formats, such as NIfTI or plain text files, to enable further processing in arbitrary analysis environments with no imposed dependencies on proprietary tools. Conversion from the original raw data formats is implemented in publicly accessible scripts; the type and version of employed file format conversion tools are documented. Moreover, all results presented in this section were produced by open source software on a computational cluster running the (Neuro)Debian operating system 14 . This computational environment is freely available to anyone, and it—in conjunction with our analysis scripts—offers a high level of transparency regarding all aspects of the analyses presented herein.
All data are made available under the terms of the Public Domain Dedication and License (PDDL; http://opendatacommons.org/licenses/pddl/1.0/). All source code is released under the terms of the MIT license (http://www.opensource.org/licenses/MIT). In short, this means that anybody is free to download and use this dataset for any purpose as well as to produce and re-share derived data artifacts. While not legally required, we hope that all users of the data will acknowledge the original authors by citing this publication and follow good scientific practice as laid out in the ODC Attribution/Share-Alike Community Norms (http://opendatacommons.org/norms/odc-by-sa/).
How to cite this article: Sengupta, A. et al. A studyforrest extension, retinotopic mapping and localization of higher visual areas. Sci. Data 3:160093 doi: 10.1038/sdata.2016.93 (2016).
Hanke, M. OpenfMRI ds000113d (2016)
We acknowledge the support of the Combinatorial NeuroImaging Core Facility at the Leibniz Institute for Neurobiology in Magdeburg. M.H. was supported by funds from the German federal state of Saxony-Anhalt and the European Regional Development Fund (ERDF), Project: Center for Behavioral Brain Sciences. This research was, in part, co-funded by the German Federal Ministry of Education and Research (BMBF 01GQ1112, 01GQ1411) and the US National Science Foundation (NSF 1129855, 1429999) as part of two US-German collaborations in computational neuroscience (CRCNS). We are grateful to the Haxby lab, Paul Downing, the maintainers of the BOSS stimulus library, and the CBCL street scene library for making the stimulus material publicly available. We appreciate the editing efforts of Alex Waite and do publicly acknowledge the excellent beer making traditions of the great state of Wisconsin.