Rapid event-related, BOLD, NHP: choose two out of three

Human functional magnetic resonance imaging (fMRI) typically employs the blood-oxygen-level-dependent (BOLD) contrast mechanism. In non-human primates (NHP), contrast enhancement is possible using monocrystalline iron-oxide nanoparticles (MION) contrast agent, which has a more temporally extended response function. However, using BOLD fMRI in NHP is desirable for interspecies comparison, and the faster response of the BOLD signal promises to be beneficial to rapid event-related (rER) designs. Here, we used rER BOLD fMRI in macaque monkeys while viewing real-world images, and found visual responses and category-selectivity consistent with previous studies. However, activity estimates were very noisy, suggesting that the lower contrast-to-noise ratio of BOLD, suboptimal behavioural performance, and motion artefacts, in combination, render rER BOLD fMRI challenging in NHP. Previous studies have shown that rER monkey fMRI is possible with MION, despite its prolonged response function. To understand this, we conducted simulations of the BOLD and MION response during rER designs, and found that no matter how fast the design, the greater amplitude of the MION response outweighs the contrast loss caused by greater temporal smoothing. We conclude that although any two of the three elements (rER, BOLD, NHP) have been shown to work well, the combination of all three is particularly challenging.


Introduction
Functional magnetic resonance imaging (fMRI) has enabled the acquisition of whole-brain images of brain activity in humans and other animals. The technique has been used to functionally localize brain regions, with particular success in localizing regions selective for different visual categories, including face-, body-, object-, and place-selective areas in humans [1][2][3] and non-human primates (NHP) [4][5][6][7][8] .
Human fMRI studies typically use the endogenous contrast agent deoxyhemoglobin, and measure the blood-oxygen-level-dependent (BOLD) signal. BOLD has been used in humans with a wide variety of experimental designs, including rapid event-related designs that give researchers great flexibility. In particular, rapid event-related fMRI enables condition-rich designs intended for pattern-information analyses 9 . BOLD fMRI has also been utilised by NHP studies (see below), however, many NHP studies have used the exogenous contrast agent monocrystalline iron oxide nanoparticle (MION) to increase the sensitivity of the measured signal. MION reflects blood volume, rather than blood oxygenation. Vanduffel et al. 10 compared the use of BOLD vs. MION in block-design experiments in awake macaque monkeys to map the brain areas selective for motion. Their results not only matched monkey electrophysiology and human fMRI results, but also showed greater spatial localization and contrast increase in MION relative to BOLD. More recently, block-designs combined with MION have been predominantly used to localize fMRI-defined category-selective areas in macaques (for example 4,7,8,11 ).
MION's slower haemodynamic response is unproblematic in the context of block designs. Leite et al. 12 compared MION with BOLD in macaques using visual checkerboard stimuli with varying presentation durations. They found that MION increased the functional sensitivity for stimuli presented at long durations, but that brief or rapidly repeated stimulus presentations led to a greater attenuation of the signal compared to BOLD, consistent with a linear model capturing the dispersion of the response over time. This suggests that MION might be less sensitive for rapid event-related designs, whose high-temporal-frequency effects might not pass through the low-temporal-frequency filter of the MION response. However, event-related designs have been successfully used in MION fMRI studies previously 13,14 .
To understand the functional homologies and analogies between the human and the NHP brain, it would be desirable to use the same contrast mechanism in both species. Given that administering MION is an invasive procedure, not approved for routine use in humans, BOLD in NHPs might be the best approach for interspecies comparisons. Indeed, Pinsk et al. 5,6 investigated visual category-selectivity using BOLD with block-designs, whereas BOLD in an event-related design with long delays was used by Kagan et al. 15 and more recently by Kaskan et al. 16 . Interestingly, the faster temporal response in BOLD fMRI might be beneficial in the context of rapid event-related designs.
Here, we explore block-design and rapid event-related (rER) BOLD fMRI in awake macaques using visual images of real-world stimuli including human and animal faces, human and animal bodies, objects, and places. In the rER experiment, each stimulus was presented for 0.5 s, and there was a 2.5 s inter-stimulus interval (see Methods). Therefore, we define our event-related design as 'rapid' on the basis that the interval between successive stimulus presentations was shorter than the duration of the hemodynamic response function 17 . We selected these stimuli because they have been shown to evoke strong category-selective visual responses in higher-order visual areas in both humans and macaques (see above).
We found clear and strong visual responses and some selectivity to categories, consistent with findings reported in previous studies, in both our block- and rER experiments. However, in the rER experiment, even after censoring scan volumes where our behavioural performance criteria were not met, and after substantial averaging, responses were quite noisy compared to (a) human rapid event-related BOLD fMRI 18 , (b) monkey rapid event-related MION fMRI 14 , and (c) our own block-design BOLD fMRI experiment. We cannot rule out that factors related to the suboptimal performance of our animals may have affected the responses we obtained.
Nevertheless, our animals maintained eye position within their fixation window for more than half the time in over 82% of volumes (see Methods), and eye fixations showed reasonable position stability (see Supplementary Information).

Results
We collected fMRI data while three macaque monkeys (M1-M3) were viewing visual images presented at the centre of a computer monitor.
We first ran a block-design fMRI experiment whose data were also used to define the regions of interest (ROIs) used in our event-related fMRI experiment conducted a few months afterwards. In the event-related experiment, we sought to probe the emergence of selectivity to different object categories in the ventral visual stream, using a stimulus set that has been successfully employed in NHP and human studies previously 14,18,20 . Additionally, given the predominant categorical organisation of face-selective regions on the macaque superior temporal sulcus (STS) 7,21 , we were particularly interested in evaluating the (dis)similarities in the activity patterns 9 elicited by the individual images in these regions.
Block-design Experiment. We found strong visual responses for most hemispheres, in the occipital and temporal lobes (figures 1 and 2). Furthermore, we identified anterior, middle, and posterior face-selective regions in the STS (see figure 3 for M1).
In particular, to localise face-selective regions in each animal, we contrasted face and place stimuli using the data from the present, block-design, experiment. To increase the chances that we captured the face-selective locations in all monkeys, we selected a liberal threshold of z=1.6. Consistent with the 'face patches' reported previously 7,8 , we found face-selective regions along the STS in most hemispheres. Where we did not identify a face-selective region (as was the case in M3, where no face-selective voxels were found in anterior or posterior left STS), we used the coordinates from another monkey in our study to generate the mask for the specific missing ROI.
The face-selective regions we found were in close proximity to the ones identified when contrasting faces and objects (figure 3B), further confirming the strong selectivity for face stimuli in macaque STS, as well as providing reassurance that the faces>places contrast we used is appropriate for revealing face selectivity 4 .

Event-related Experiment: Univariate Analysis. Figure 4 shows percent signal change data across all ROIs for the event-related experiment. The bars depict data averaged across the individual stimuli within each category, then averaged across sessions (all subjects). We found no category-selectivity in early visual areas V1 or V2; rather, category-selectivity appears to emerge for the first time at higher levels of visual processing. Specifically, a significant main effect of stimulus category was first observed in area V4 (F(3,75) = 3.15, p = .030). As figure 4A shows, V4 responses to the images of places were greater than the responses to the rest of the images. Farther along the ventral stream, a significant main effect of category (F(3,75) = 2.83, p = .044) was found in area TEO, where responses to body-part images were greater than those to the other categories. Area TEm also showed a significant main effect of category (F(3,75) = 2.9, p = .041), with a preference for body-parts. We did not find a significant main effect in TEpd (F(3,75) = 2.19, p = .096); however, responses to body-parts were numerically greater than those to the other categories. Moving more anteriorly in IT cortex, into TEad and TEa, we observed responses below baseline for almost all stimulus categories and no significant category-selectivity in either region (p's > .05).

We also extracted data from the posterior, middle, and anterior face-selective ROIs. In the posterior face-selective ROI, we found a significant main effect of category (F(3,75) = 4.46, p = .006), with greater responses observed to face images. In the middle face-selective STS area, we found a significant main effect of category (F(3,75) = 2.95, p = .038), with greater responses observed to body-part images. Finally, in the anterior STS area, we found greater responses to faces compared to the other categories, but this did not reach statistical significance (figure 4B).
Finally, similar to the block-design experiment, we contrasted face and place stimuli to generate face-selectivity maps in the event-related experiment, using a threshold of z=1.6, uncorrected. As shown in figure 5, face-selective regions emerged in plausible STS locations; however, activations were weaker than in the block-design experiment (figure 3A), and, contrary to the block-design results, no active voxels survived cluster correction here.

MION dominates BOLD for simulated event-related monkey fMRI.
Our BOLD fMRI rapid event-related response estimates are noisy, suggesting that BOLD rapid event-related designs, although successful in humans, are challenging in monkeys. An important question is whether rapid event-related designs might work better in monkeys when MION is used.
Rapid event-related designs can work with MION [12][13][14]19 . However, it is unclear how the larger amplitude of the MION response (which helps sensitivity; see figure 7A) trades off against its larger temporal width (which might reduce the differential sensitivity to fast-switching stimuli in rapid event-related designs). Leite and Mandeville 19 argued, on the basis of simulations, that MION benefits more than BOLD from randomization of the stimulus timing, which moves effect energy into lower temporal-frequency bands. Even if high-temporal-frequency effects are significantly attenuated in MION fMRI, they could still be stronger than in BOLD fMRI. The power spectrum of the MION model response and linear-model simulations indeed suggest that MION should have greater sensitivity than BOLD in general, i.e. for any type of design 12,19 . However, the MION model used by Leite et al. 12,19 has a sharp onset, which, on one hand, might not realistically reflect the actual MION response and, on the other, might enable the response model to transmit more high-temporal-frequency information than the actual MION response.
To assess more conservatively whether MION is theoretically superior to BOLD even for rapid event-related designs, we performed analyses and simulations using a modified version of the Leite et al. 12 MION model. To address these questions more generally, we analysed the impulse responses of BOLD and MION in the frequency domain (figure 8). We added versions of the BOLD and MION impulse response functions with an even smoother onset than that of Boynton et al. 26 . The results demonstrate that MION has full-spectrum dominance over BOLD.

Figure 8: MION affords greater sensitivity than BOLD under conservative assumptions about the onset, for arbitrarily rapid event-related designs. (A) Different impulse response function models considered for BOLD (blue) and MION (red). The most conservative models (dashed lines in saturated red and blue) have a smooth onset (identical for BOLD and MION), which transmits less high-temporal-frequency effect energy than the Boynton et al. 26 BOLD model, and much less than the Leite et al. 12 MION model. (B) Periodograms show that, for conventional as well as our more conservative smooth-onset models, MION dominates BOLD in terms of its transmission of effect energy across the full spectrum of temporal frequencies. We therefore expect that MION will provide greater sensitivity to effects of interest, no matter how rapid the event-related design.
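As a rough illustration of this frequency-domain analysis, the sketch below constructs gamma-variate impulse responses for BOLD and MION and compares their one-sided amplitude spectra. The gamma parameters and the 1.8x peak factor are illustrative assumptions, not the exact impulse-response models used in our analyses.

```python
import numpy as np

def gamma_irf(t, mean_lag, sd, peak=1.0):
    """Gamma-shaped impulse response with a given mean lag and SD (seconds)."""
    shape = (mean_lag / sd) ** 2          # gamma shape parameter
    scale = sd ** 2 / mean_lag            # gamma scale parameter
    h = t ** (shape - 1.0) * np.exp(-t / scale)
    return peak * h / h.max()             # normalise to the requested peak

dt = 0.1                                  # sampling step (s)
t = np.arange(0.0, 60.0, dt)

# Illustrative parameters: BOLD is fast; MION is slower with a 1.8x peak
bold = gamma_irf(t, mean_lag=3.0, sd=1.5, peak=1.0)
mion = gamma_irf(t, mean_lag=10.0, sd=6.0, peak=1.8)

# One-sided amplitude spectra of the two impulse responses
freqs = np.fft.rfftfreq(t.size, d=dt)
bold_spec = np.abs(np.fft.rfft(bold)) * dt
mion_spec = np.abs(np.fft.rfft(mion)) * dt

# At DC and low temporal frequencies MION transmits more effect energy,
# because its response is both taller and wider; whether it dominates at
# every frequency depends on the exact impulse-response models assumed.
print(mion_spec[0] / bold_spec[0])
```

The DC ratio corresponds to the relative sensitivity for very slow (block-like) designs; the high-frequency tails govern sensitivity for rapid event-related designs.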
We tested the robustness of MION's full-spectrum dominance to changes in the assumed factor by which the peak of the MION response exceeds that of the BOLD response. The previous simulations assumed that the MION response peaks 1.8 times higher than the BOLD response (when the temporal noise is equated). We relaxed this assumption by gradually lowering the peak amplitude of the MION response in the simulation (not shown). MION maintained its full-spectrum dominance over BOLD for rapid event-related designs down to a factor of 1.5 (peak of MION / peak of BOLD) for the smooth-onset variants of both impulse response functions. In sum, the simulations suggest that MION robustly dominates BOLD under conservative assumptions, and we expect that MION will yield greater sensitivity no matter what experimental design is used.
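A much-simplified version of this robustness check can be sketched by convolving a randomised two-condition event sequence with each impulse response and varying the assumed MION/BOLD peak ratio. The gamma parameters, the 3 s event spacing, and the run length are illustrative assumptions; this sketch compares only total differential effect energy, whereas the analysis above compares transmission across the full frequency spectrum.

```python
import numpy as np

def gamma_irf(t, mean_lag, sd, peak=1.0):
    """Gamma-shaped impulse response (peak-normalised)."""
    shape = (mean_lag / sd) ** 2
    scale = sd ** 2 / mean_lag
    h = t ** (shape - 1.0) * np.exp(-t / scale)
    return peak * h / h.max()

rng = np.random.default_rng(0)
dt, run_dur = 0.1, 600.0                     # 10 min simulated run
t_irf = np.arange(0.0, 60.0, dt)
n = int(round(run_dur / dt))

# Two-condition rapid event-related sequence: one event every 3 s, with
# the condition label (+1/-1, the differential contrast) randomised
events = np.zeros(n)
onsets = np.arange(0, n, int(round(3.0 / dt)))
events[onsets] = rng.choice([-1.0, 1.0], size=onsets.size)

bold = np.convolve(events, gamma_irf(t_irf, 3.0, 1.5, peak=1.0))[:n]

# Differential effect energy relative to BOLD for several assumed
# MION/BOLD peak-amplitude ratios (temporal noise assumed equal)
ratios = {}
for factor in (1.0, 1.5, 1.8):
    mion = np.convolve(events, gamma_irf(t_irf, 10.0, 6.0, peak=factor))[:n]
    ratios[factor] = np.sum(mion ** 2) / np.sum(bold ** 2)
    print(f"peak factor {factor:.1f}: "
          f"MION/BOLD effect-energy ratio = {ratios[factor]:.2f}")
```

In this sketch the energy ratio scales with the square of the peak factor, making explicit how the amplitude advantage of MION trades off against the temporal smoothing of its wider response.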

Discussion
We collected BOLD fMRI data from awake behaving macaque monkeys, focusing on the occipital and temporal cortices in two independent experiments: a block-design experiment and a rapid event-related experiment. Each experiment used an independent set of visual images of real-world stimuli. In both experiments, we found visual responses and category selectivity.
However, the effects were noisier than expected, especially in the event-related experiment.
In the block-design experiment, we found strong visual responses in the occipital and temporal lobes of all three subjects. Furthermore, we were able to identify bilateral anterior, middle, and posterior face-selective regions for most of the subjects. These face-selective regions were in the expected locations, but less specific than those reported in the literature using MION. For example, Tsao et al. 8 found six face patches in each hemisphere, two for each of the anterior, middle, and posterior parts of the STS. Here, we found correspondence in the anatomical locations, with some subjects showing two patches in each portion of the STS, but these were not easily identifiable in all subjects, possibly due to the lower functional contrast of BOLD. However, this could also be because Tsao and colleagues collected more volumes per subject, or had more stimulus repetitions, in their block-design experiments 8 . We also cannot rule out the possibility that our monkeys' performance on the task (they were not fluid-restricted during testing) may have negatively affected our observations. For example, on average, we had to exclude some 15% of the collected MRI volumes because monkeys did not sustain viewing within the fixation window, whereas, in the remaining data, eye fixations showed reasonable, yet imperfect, position stability (see Methods and Supplementary Information).
In the event-related experiment, the stimulus-evoked BOLD responses were substantially noisier. Using an ROI approach, we considered ventral-stream brain regions from early visual cortex to anterior IT. In early visual areas, we found strong visual responses but no significant category-selectivity. Beyond early visual cortex, category-selectivity begins to emerge: we found some category-selectivity in V4, TEO, and TEm, as well as in the face-selective regions in the STS. However, in anterior IT regions such as TEad and TEa, we found no evidence of category-selective responses. This could be related to the relatively weaker fMRI signal found in these regions, but could also be because these regions are more involved in distinguishing identities within a particular category (e.g., refs 27-29, but see ref 4 ). RSA using noise-covariance-normalised distances (crossnobis distances) on face-selective regions revealed some differences between early visual areas and face-selective regions. The RDMs appeared qualitatively different across areas, but the pattern dissimilarities within each brain region were too noisy for detailed analyses of the representational geometries. Brain activity patterns in early visual areas were strongly dissimilar across stimulus conditions; in the face-selective regions, by contrast, there was only very weak structure, suggesting some information about stimulus category. Overall, we found strong visual responses and some category-selectivity in both our block- and event-related designs.
However, the data were noisy even after artefact rejection and substantial averaging, which we attribute to the lower contrast-to-noise ratio of BOLD compared to MION, the smaller brains of NHPs, as well as eye-movement- and motion-related artefacts.
Collecting MRI data using MION was not an option under the project license of our study; we therefore performed simulations based on the known response properties of BOLD and MION to compare the response profiles of the two contrast mechanisms.
Considering the slower temporal response profile of MION, and previous findings of a more attenuated differential response in MION compared to BOLD at faster rates of stimulus switching 12 , one might expect that BOLD will work better than MION for rapid event-related designs. However, our simulations suggest that at every timescale of stimulus presentation, MION dominates BOLD in terms of sensitivity.
Overall, the stronger BOLD responses we measured in the block-design experiment compared to the rapid event-related experiment suggest that block designs may be a better choice than event-related designs when using BOLD fMRI in NHPs, and our simulations suggest that MION may still be better than BOLD even for rapid event-related designs.

Methods

Stimuli. In the block-design experiment, we used a subset of the stimuli used by our group previously 33 : images of faces, objects, places, and scrambled versions of objects. In the event-related experiment, the monkeys were presented with a different stimulus set (a subset of the stimuli used in refs 14,18,20 ) that consisted of 48 images of human and animal faces and body parts, man-made and natural objects, and places.

Preprocessing. Raw data were reconstructed offline using a sensitivity encoding (SENSE; see ref 34 ) reconstruction method in Matlab, to reduce ghosting artefacts 35 (Offline SENSE GUI, Windmiller Kolster Scientific, Fresno, CA). To reduce artefacts caused by body motion 36 , we further used motion-correction algorithms 37 as follows: all volumes within a run were aligned slice-by-slice to the single volume identified as having the least amount of motion (least variance from the mean). The aligned data obtained in the same session were merged into a single 4D NIFTI file using the Functional MRI of the Brain (FMRIB) Software Library (FSL; www.fmrib.ox.ac.uk/fsl) 38 .
In FSL, the 4D data were skull-stripped and subjected to spatial smoothing (full-width half maximum of 3mm), intensity normalisation, and high-pass filtering (cutoff 60 s). Finally, we spatially co-registered the functional data to a standard anatomical template (MACAQUE-F99 22 ) using affine transformation.

Eye-movements and Motion Artefacts.
We analysed eye-tracking data to identify and exclude trials where the monkeys broke fixation. Eye-movements were monitored using an MR-compatible LED infrared camera (MRC Systems GmbH, Germany). Eye position was calibrated at the beginning of each session; this calibration procedure was part of the animals' regular behavioural training in the mock scanner. In the scanner, eye-movements were recorded at 20 Hz, that is, ~40 samples were obtained in each volume. A volume was excluded if the subject broke fixation for more than half the time in that volume. For each subject, the mean percentage of volumes per session excluded due to broken fixations was: M1=16.3% (standard error of the mean, SEM=2.4%); M2=17.1% (SEM=7.0%); M3=10.0% (SEM=6.7%).

Regions of Interest. Recent NHP studies have revealed that the superior and inferior banks of the macaque STS contain several fMRI-identified face-selective regions [4][5][6][7][8] . To reveal such face-selective regions, we contrasted face and place stimuli (from our block-design experiment data), similarly to previous studies 4,42,43 .
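The volume-censoring rule described above (a volume is dropped if fixation was broken for more than half of its ~40 eye samples) can be sketched as follows; the function name and array layout are our own illustrative choices.

```python
import numpy as np

def excluded_volumes(in_window, samples_per_volume=40):
    """Flag volumes in which fixation was broken for more than half the samples.

    `in_window` holds one boolean per 20 Hz eye sample: True when gaze fell
    inside the fixation window.
    """
    n_vol = len(in_window) // samples_per_volume
    samples = np.asarray(in_window[:n_vol * samples_per_volume], dtype=float)
    frac_fixating = samples.reshape(n_vol, samples_per_volume).mean(axis=1)
    return frac_fixating < 0.5            # True -> censor this volume

# Hypothetical session of 3 volumes: fixation held, fully broken, 75% held
gaze = np.concatenate([np.ones(40), np.zeros(40),
                       np.r_[np.ones(30), np.zeros(10)]])
print(excluded_volumes(gaze))             # only the middle volume is censored
```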
To compare category-selectivity across different parts of the brain in our event-related experiment, we equated the size of all our ROIs 44 . Specifically, for V1, V2, V4, TEO and TE, we created a 2 mm radius spherical mask around the voxel with peak visual activation (ON>OFF contrast in our block-design experiment), within each area (figures 1 and 2). Note that within V4, the spherical masks for the three monkeys were located in the ventral portion of V4 45,46 . For the functionally-defined, face-selective, STS areas, we created a 2 mm radius spherical mask around the peak face-selective voxel (faces>places contrast in the block-design), in the posterior, middle, and anterior STS (figure 3A).
The mask generation pipeline that was applied to all ROIs across both hemispheres for each animal was as follows. Within a given mask, a sphere was generated around the peak visual-or the peak face-selective voxel from our block-design experiment across all sessions within each animal. Before extracting fMRI data from the spherical masks, masks were coregistered to each individual scanning session's example functional image to align with the functional space of each session. Final spherical ROIs had approximately equal volume (~30 mm 3 ) across animals. We chose spheres of this size so that our spherical ROIs approximately matched the volume of our smallest functionally-defined region (a cluster of face-selective voxels in the anterior STS).
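A minimal sketch of the spherical-mask step, assuming a voxel grid with known voxel size; at a hypothetical 1 mm isotropic resolution, a 2 mm radius sphere contains 33 voxels (~33 mm^3), close to the ~30 mm^3 ROI volume noted above.

```python
import numpy as np

def spherical_mask(shape, peak_ijk, radius_mm, voxel_size_mm):
    """Boolean mask of voxels whose centres lie within radius_mm of the peak."""
    grids = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    dist2 = sum(((g - c) * v) ** 2
                for g, c, v in zip(grids, peak_ijk, voxel_size_mm))
    return dist2 <= radius_mm ** 2

# Hypothetical grid: 1 mm isotropic voxels, 2 mm radius sphere around a peak
mask = spherical_mask((64, 64, 32), peak_ijk=(30, 20, 15),
                      radius_mm=2.0, voxel_size_mm=(1.0, 1.0, 1.0))
print(mask.sum(), "voxels")               # 33 voxels, ~33 mm^3 at 1 mm^3/voxel
```

In practice the peak voxel coordinate would come from the block-design statistical map, and the mask would then be co-registered to each session's functional space as described above.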
Event-related fMRI: Data Analysis. We used custom-written code in Matlab to temporally coregister the stimulus presentation times with fMRI volumes and eye-tracking recordings.
Statistical analyses were performed using FSL's FMRI Expert Analysis Tool (FEAT) Version 6.00 by estimating a general linear model (GLM). For each session, each image was modelled as an explanatory variable (EV, i.e., regressor). Monkeys' head-motion outliers (see above) were included in the GLM as additional EVs of no interest. All EVs were convolved with a hemodynamic response function (HRF) adjusted to reflect the macaque BOLD HRF, which is faster than the human one: we used a gamma HRF with 3 s mean lag and 1.5 s standard deviation (see refs 15,47,48 ). We set up one contrast (stimulus > baseline) for each of our 48 images. Z-statistic images were derived from the EVs as follows: each EV in the design matrix resulted in a parameter estimate (PE) image indicating the fit of the model waveform to the data in each voxel. Each PE image was converted to a t-statistic image by dividing the PE by its standard error (derived from the residual noise after the complete model was fit). T-statistic images were then converted to z-statistic images following standard statistical transformations.
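The regressor-construction step can be sketched as below: a gamma HRF with 3 s mean lag and 1.5 s standard deviation (as specified above), convolved with a stimulus boxcar and sampled at the volume acquisition times. The TR, run length, and function names are illustrative assumptions.

```python
import numpy as np

def macaque_hrf(t, mean_lag=3.0, sd=1.5):
    """Gamma HRF with 3 s mean lag and 1.5 s SD (faster than the human HRF)."""
    shape = (mean_lag / sd) ** 2          # gamma shape parameter (here: 4)
    scale = sd ** 2 / mean_lag            # gamma scale parameter (here: 0.75 s)
    h = t ** (shape - 1.0) * np.exp(-t / scale)
    return h / h.sum()                    # unit-area kernel

def build_regressor(onsets_s, duration_s, tr, n_vols, dt=0.1):
    """Stimulus boxcar convolved with the HRF, sampled at each TR."""
    n = int(round(n_vols * tr / dt))
    boxcar = np.zeros(n)
    for onset in onsets_s:
        i = int(round(onset / dt))
        boxcar[i:i + int(round(duration_s / dt))] = 1.0
    ev = np.convolve(boxcar, macaque_hrf(np.arange(0.0, 30.0, dt)))[:n]
    return ev[::int(round(tr / dt))]      # sample at volume acquisition times

# Hypothetical run: 0.5 s stimuli every 3 s, TR = 2 s, 100 volumes
ev = build_regressor(onsets_s=np.arange(0.0, 180.0, 3.0), duration_s=0.5,
                     tr=2.0, n_vols=100)
print(ev.shape)                           # -> (100,)
```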
The beta weight for each stimulus EV, within each ROI, was extracted and converted to % signal change using the Featquery tool in FSL. In every ROI, we averaged the data from the two hemispheres.
Event-related fMRI: Representational Similarity Analysis. To perform representational similarity analysis (RSA) 9 on face-selective regions and early visual cortex, we extracted the pre-processed fMRI data from these ROIs. ROIs included bilateral anterior, middle, and posterior face-selective regions, and early visual regions bilateral V1 and V2. For the face-selective regions, we used the localization procedure described above, and created new spherical masks with a 5 mm radius. We used larger spherical masks with more voxels for the RSA to improve the sensitivity of these analyses, since cross-validated distance measures are based on voxel patterns rather than simply the mean activation in an area. For the early visual regions, we used anatomically-derived masks from ref 25 .
We loaded unsmoothed data from each spherical ROI for each face-selective region in each subject into Matlab and constructed GLMs using custom-written Matlab code. The EVs were modelled as above, with each image modelled as one variable for each run. Each run was modelled separately, in order to perform cross-validation across runs within each session. A GLM-based analysis was performed on each run for each animal and each ROI, which produced a vector of beta weights for each ROI. We used these beta weights to produce representational dissimilarity matrices (RDMs). We used the cross-validated Mahalanobis (or crossnobis) distance, for the distance measure in the RDMs, representing the dissimilarity between two sets of voxel-wise brain patterns 49,50 .
The crossnobis distance was computed as follows:

d_kj = (b_k^A - b_j^A)^T Σ_A^(-1) (b_k^B - b_j^B)

where b_k and b_j are the vectors of beta weights (fMRI voxel activation patterns) to be compared for images k and j, A denotes the training set, B denotes the test set, Σ_A is the noise covariance matrix estimated from the residuals of the GLM for this ROI in the training set A (see ref 50 ), and ^T denotes transpose. The crossnobis distance was computed between each pair of images in each ROI.
Cross-validation was performed across runs within each session. Given that trials with excessive eye-movements were excluded, not every run included a trial for each image, and therefore it was not possible to use a leave-one-run-out method for computing the crossnobis distance. Instead, we used a split-half approach to estimate the pairwise crossnobis distances.
For each session, we randomly assigned half the runs to the training set A and half the runs to the test set B (with one run left out of the analysis when there was an odd number of runs within a session). This was done 50 times to produce 50 cross-validated distances, which were averaged across cross-validation folds. The same procedure was performed for each session, and the RDMs were averaged across sessions. The noise covariance matrix was estimated from the training data: we obtained the residuals R from the GLM for an ROI, which form a T (number of time points) × P (number of voxels) matrix. The P × P noise covariance matrix can then be estimated as:

Σ_A = (1/T) R^T R
As in the univariate analysis, the RDMs were averaged across hemispheres, which gave three RDMs for face-selective regions and three RDMs for early visual areas for each monkey.
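The split-half crossnobis computation described above can be sketched as follows; the small ridge term added before inverting the covariance (for numerical stability) and all variable names are our own illustrative choices.

```python
import numpy as np

def crossnobis(betas_A, betas_B, residuals_A):
    """Cross-validated Mahalanobis distances between condition patterns.

    betas_A, betas_B : (n_conditions, n_voxels) beta patterns from the
        training (A) and test (B) halves of the runs.
    residuals_A : (n_timepoints, n_voxels) GLM residuals from half A,
        used to estimate the noise covariance Σ_A = (1/T) R^T R.
    """
    T, P = residuals_A.shape
    sigma = residuals_A.T @ residuals_A / T        # P x P noise covariance
    sigma += 1e-6 * np.eye(P)                      # small ridge for stability
    sigma_inv = np.linalg.inv(sigma)
    n = betas_A.shape[0]
    rdm = np.zeros((n, n))
    for k in range(n):
        for j in range(n):
            dA = betas_A[k] - betas_A[j]           # training-set difference
            dB = betas_B[k] - betas_B[j]           # test-set difference
            rdm[k, j] = dA @ sigma_inv @ dB / P    # crossnobis distance
    return rdm

# Toy example: 4 conditions, 50 voxels, independent noise in each half
rng = np.random.default_rng(1)
patterns = rng.standard_normal((4, 50))
bA = patterns + 0.1 * rng.standard_normal((4, 50))
bB = patterns + 0.1 * rng.standard_normal((4, 50))
rdm = crossnobis(bA, bB, residuals_A=rng.standard_normal((200, 50)))
```

Because the two pattern differences come from independent halves of the data, noise-only distances average to zero rather than being positively biased, which is the motivation for the cross-validated distance.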