Extensive sampling of neural activity during rich cognitive phenomena is critical for robust understanding of brain function. Here we present the Natural Scenes Dataset (NSD), in which high-resolution functional magnetic resonance imaging responses to tens of thousands of richly annotated natural scenes were measured while participants performed a continuous recognition task. To optimize data quality, we developed and applied novel estimation and denoising techniques. Simple visual inspections of the NSD data reveal clear representational transformations along the ventral visual pathway. Further exemplifying the inferential power of the dataset, we used NSD to build and train deep neural network models that predict brain activity more accurately than state-of-the-art models from computer vision. NSD also includes substantial resting-state and diffusion data, enabling network neuroscience perspectives to constrain and enhance models of perception and memory. Given its unprecedented scale, quality and breadth, NSD opens new avenues of inquiry in cognitive neuroscience and artificial intelligence.
The NSD dataset is freely available at http://naturalscenesdataset.org. The data are hosted in the cloud, allowing researchers to exploit high-performance cloud computing to efficiently analyze the dataset. We provide both raw data in BIDS format86 and prepared data files, along with extensive technical documentation in the NSD Data Manual. To ensure strict validation for an upcoming Algonauts prediction challenge87, the initial public release will withhold the last three NSD scan sessions from each participant (approximately 8.4% of the NSD data). Images used for the NSD were taken from the Common Objects in Context database14 (https://cocodataset.org).
We provide an archive of code used in this study (https://github.com/cvnlab/nsddatapaper/) as well as utility functions for working with the prepared NSD data (https://github.com/cvnlab/nsdcode/). Custom algorithms developed for this study include GLMsingle (https://github.com/cvnlab/GLMsingle/) and fracridge (https://github.com/nrdg/fracridge/). Example scripts demonstrating scientific analyses of the NSD data are available (https://github.com/cvnlab/nsdexamples/); these scripts might be useful for teaching purposes.
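As a concrete illustration of the fractional ridge regression idea behind fracridge, the following is a minimal numpy sketch (not the library's actual, much faster implementation): it finds, by bisection, the regularization level alpha at which the ridge coefficient norm equals a target fraction of the unregularized (OLS) coefficient norm.

```python
import numpy as np

def ridge_coef(X, y, alpha):
    """Closed-form ridge solution: (X'X + alpha*I)^-1 X'y."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ y)

def frac_ridge(X, y, frac, alpha_hi=1e8, tol=1e-6):
    """Find alpha such that ||beta_ridge|| / ||beta_OLS|| ~= frac (0 < frac <= 1).
    Bisection works because the coefficient norm shrinks monotonically
    as alpha grows."""
    norm_ols = np.linalg.norm(ridge_coef(X, y, 0.0))
    lo, hi = 0.0, alpha_hi
    mid = 0.5 * (lo + hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        ratio = np.linalg.norm(ridge_coef(X, y, mid)) / norm_ols
        if abs(ratio - frac) < tol:
            break
        if ratio > frac:      # too little shrinkage -> larger alpha
            lo = mid
        else:
            hi = mid
    return ridge_coef(X, y, mid), mid

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(100)

# request coefficients whose norm is half the OLS norm
beta_half, alpha_half = frac_ridge(X, y, frac=0.5)
norm_ratio = np.linalg.norm(beta_half) / np.linalg.norm(ridge_coef(X, y, 0.0))
```

The appeal of the fractional parameterization is that the fraction is directly interpretable and spans the useful range of shrinkage uniformly, whereas useful values of alpha vary by orders of magnitude across voxels.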
de Vries, S. E. J. et al. A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. Nat. Neurosci. 23, 138–151 (2020).
Siegle, J. H. et al. Survey of spiking in the mouse visual system reveals functional hierarchy. Nature 592, 86–92 (2021).
Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M. & Harris, K. D. High-dimensional geometry of population responses in visual cortex. Nature 571, 361–365 (2019).
Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
Van Essen, D. C. et al. The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79 (2013).
Zheng, Z. et al. A complete electron microscopy volume of the brain of adult Drosophila melanogaster. Cell 174, 730–743 (2018).
Van Essen, D. C. et al. Mapping visual cortex in monkeys and humans using surface-based atlases. Vis. Res. 41, 1359–1378 (2001).
Grill-Spector, K. & Malach, R. The human visual cortex. Annu. Rev. Neurosci. 27, 649–677 (2004).
Wheeler, M. E., Petersen, S. E. & Buckner, R. L. Memory’s echo: vivid remembering reactivates sensory-specific cortex. Proc. Natl Acad. Sci. USA 97, 11125–11129 (2000).
Breedlove, J. L., St-Yves, G., Olman, C. A. & Naselaris, T. Generative feedback explains distinct brain activity codes for seen and mental images. Curr. Biol. 30, 2211–2224 (2020).
Kay, K. N., Weiner, K. S. & Grill-Spector, K. Attention reduces spatial uncertainty in human ventral temporal cortex. Curr. Biol. 25, 595–600 (2015).
Huth, A. G., Nishimoto, S., Vu, A. T. & Gallant, J. L. A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron 76, 1210–1224 (2012).
Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf (University of Toronto, 2009).
Lin, T.-Y. et al. Microsoft COCO: Common Objects in Context. European Conference on Computer Vision. https://link.springer.com/chapter/10.1007/978-3-319-10602-1_48, 740–755 (Springer, 2014).
Güçlü, U. & van Gerven, M. A. J. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J. Neurosci. 35, 10005–10014 (2015).
Khaligh-Razavi, S.-M. & Kriegeskorte, N. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Comput. Biol. 10, e1003915 (2014).
Seeliger, K. et al. End-to-end neural system identification with neural information flow. PLoS Comput. Biol. 17, e1008558 (2021).
Stansbury, D. E., Naselaris, T. & Gallant, J. L. Natural scene statistics account for the representation of scene categories in human visual cortex. Neuron 79, 1025–1034 (2013).
St-Yves, G. & Naselaris, T. The feature-weighted receptive field: an interpretable encoding model for complex feature spaces. Neuroimage 180, 188–202 (2018).
Yamins, D. L. K. et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl Acad. Sci. USA 111, 8619–8624 (2014).
Naselaris, T. et al. Cognitive computational neuroscience: a new conference for an emerging discipline. Trends Cogn. Sci. 22, 365–367 (2018).
Chang, N. et al. BOLD5000, a public fMRI dataset while viewing 5000 visual images. Sci. Data 6, 49 (2019).
Horikawa, T. & Kamitani, Y. Generic decoding of seen and imagined objects using hierarchical visual features. Nat. Commun. 8, 15037 (2017).
Kay, K. N., Naselaris, T., Prenger, R. J. & Gallant, J. L. Identifying natural images from human brain activity. Nature 452, 352–355 (2008).
Triantafyllou, C. et al. Comparison of physiological noise at 1.5 T, 3 T and 7 T and optimization of fMRI acquisition parameters. Neuroimage 26, 243–250 (2005).
Brady, T. F., Konkle, T., Alvarez, G. A. & Oliva, A. Visual long-term memory has a massive storage capacity for object details. Proc. Natl Acad. Sci. USA 105, 14325–14329 (2008).
Haxby, J. V., Guntupalli, J. S., Nastase, S. A. & Feilong, M. Hyperalignment: modeling shared information encoded in idiosyncratic cortical topographies. eLife 9, e56601 (2020).
Power, J. D., Lynch, C. J., Adeyemo, B. & Petersen, S. E. A critical, event-related appraisal of denoising in resting-state fMRI studies. Cereb. Cortex 30, 5544–5559 (2020).
Roth, Z. N., Ryoo, M. & Merriam, E. P. Task-related activity in human visual cortex. PLoS Biol. 18, e3000921 (2020).
Benson, N. C. et al. The human connectome project 7 Tesla retinotopy dataset: description and population receptive field analysis. J. Vis. 18, 23 (2018).
Stigliani, A., Weiner, K. S. & Grill-Spector, K. Temporal processing capacity in high-level visual cortex is domain specific. J. Neurosci. 35, 12412–12424 (2015).
Kay, K. et al. A critical assessment of data quality and venous effects in sub-millimeter fMRI. Neuroimage 189, 847–869 (2019).
Gordon, E. M. et al. Precision functional mapping of individual human brains. Neuron 95, 791–807 (2017).
Kang, X., Yund, E. W., Herron, T. J. & Woods, D. L. Improving the resolution of functional brain imaging: analyzing functional data in anatomical space. Magn. Reson. Imaging 25, 1070–1078 (2007).
Kay, K. N., Rokem, A., Winawer, J., Dougherty, R. F. & Wandell, B. GLMdenoise: a fast, automated technique for denoising task-based fMRI data. Front. Neurosci. 7, 247 (2013).
Rokem, A. & Kay, K. Fractional ridge regression: a fast, interpretable reparameterization of ridge regression. Gigascience 9, giaa133 (2020).
Albrecht, D. G. & Hamilton, D. B. Striate cortex of monkey and cat: contrast response function. J. Neurophysiol. 48, 217–237 (1982).
Wagner, A. D., Shannon, B. J., Kahn, I. & Buckner, R. L. Parietal lobe contributions to episodic memory retrieval. Trends Cogn. Sci. 9, 445–453 (2005).
Spaniol, J. et al. Event-related fMRI studies of episodic encoding and retrieval: meta-analyses using activation likelihood estimation. Neuropsychologia 47, 1765–1779 (2009).
Gonzalez-Castillo, J. et al. Whole-brain, time-locked activation with simple tasks revealed using massive averaging and model-free analysis. Proc. Natl Acad. Sci. USA 109, 5487–5492 (2012).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Connolly, A. C. et al. The representation of biological classes in the human brain. J. Neurosci. 32, 2608–2618 (2012).
Naselaris, T., Stansbury, D. E. & Gallant, J. L. Cortical representation of animate and inanimate objects in complex natural scenes. J. Physiol. Paris 106, 239–249 (2012).
Long, B., Yu, C.-P. & Konkle, T. Mid-level visual features underlie the high-level categorical organization of the ventral stream. Proc. Natl Acad. Sci. USA 115, E9015–E9024 (2018).
Henriksson, L., Khaligh-Razavi, S.-M., Kay, K. & Kriegeskorte, N. Visual representations are dominated by intrinsic fluctuations correlated between areas. Neuroimage 114, 275–286 (2015).
Naselaris, T., Kay, K. N., Nishimoto, S. & Gallant, J. L. Encoding and decoding in fMRI. Neuroimage 56, 400–410 (2011).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25 https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html, 1097–1105 (2012).
Cadena, S. A. et al. Deep convolutional models improve predictions of macaque V1 responses to natural images. PLoS Comput. Biol. 15, e1006897 (2019).
Wang, A., Tarr, M. & Wehbe, L. Neural Taskonomy: Inferring the Similarity of Task-Derived Representations from Brain Activity. In Advances in Neural Information Processing Systems 32 https://papers.nips.cc/paper/2019/hash/f490c742cd8318b8ee6dca10af2a163f-Abstract.html, 15475–15485 (2019).
Sinz, F. H., Pitkow, X., Reimer, J., Bethge, M. & Tolias, A. S. Engineering a less artificial intelligence. Neuron 103, 967–979 (2019).
Aliko, S., Huang, J., Gheorghiu, F., Meliss, S. & Skipper, J. I. A naturalistic neuroimaging database for understanding the brain using ecological stimuli. Sci. Data 7, 347 (2020).
Nastase, S. A., Liu, Y.-F., Hillman, H., Norman, K. A. & Hasson, U. Leveraging shared connectivity to aggregate heterogeneous datasets into a common response space. Neuroimage 217, 116865 (2020).
Taylor, J. R. et al. The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) data repository: structural and functional MRI, MEG, and cognitive data from a cross-sectional adult lifespan sample. Neuroimage 144, 262–269 (2017).
Bellec, P. & Boyle, J. A. Bridging the gap between perception and action: the case for neuroimaging, AI and video games. Preprint at https://psyarxiv.com/3epws (2019).
Pinho, A. L. et al. Individual Brain Charting, a high-resolution fMRI dataset for cognitive mapping. Sci. Data 5, 180105 (2018).
Poldrack, R. A. et al. Long-term neural and physiological phenotyping of a single human. Nat. Commun. 6, 8885 (2015).
Seeliger, K., Sommers, R. P., Güçlü, U., Bosch, S. E. & van Gerven, M. A. J. A large single-participant fMRI dataset for probing brain responses to naturalistic stimuli in space and time. Preprint at https://www.biorxiv.org/content/10.1101/687681v1 (2019).
Naselaris, T., Allen, E. & Kay, K. Extensive sampling for complete models of individual brains. Curr. Opin. Behav. Sci. 40, 45–51 (2021).
Polimeni, J. R., Renvall, V., Zaretskaya, N. & Fischl, B. Analysis strategies for high-resolution UHF-fMRI data. Neuroimage 168, 296–320 (2018).
Harms, M. P. et al. Extending the Human Connectome Project across ages: imaging protocols for the Lifespan Development and Aging projects. Neuroimage 183, 972–984 (2018).
Power, J. D. et al. Customized head molds reduce motion during resting state fMRI scans. Neuroimage 189, 141–149 (2019).
Brainard, D. H. The Psychophysics Toolbox. Spat. Vis. 10, 433–436 (1997).
Pelli, D. G. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 437–442 (1997).
Caesar, H., Uijlings, J. & Ferrari, V. COCO-Stuff: Thing and Stuff classes in context. In IEEE/CVF Conf. Computer Vision and Pattern Recognition https://doi.ieeecomputersociety.org/10.1109/CVPR.2018.00132, 1209–1218 (2018).
Schira, M. M., Tyler, C. W., Breakspear, M. & Spehar, B. The foveal confluence in human visual cortex. J. Neurosci. 29, 9050–9058 (2009).
Shahid, A., Wilkinson, K., Marcu, S. & Shapiro, C. M. Stanford Sleepiness Scale (SSS). In: STOP, THAT and One Hundred Other Sleep Scales (eds. Shahid, A., Wilkinson, K., Marcu, S. & Shapiro, C. M.) 369–370 (Springer, 2012).
Marks, D. F. Visual imagery differences in the recall of pictures. Br. J. Psychol. 64, 17–24 (1973).
Torgesen, J. K., Wagner, R. & Rashotte, C. TOWRE-2: Test of Word Reading Efficiency (Pearson, 2012).
Duchaine, B. & Nakayama, K. The Cambridge Face Memory Test: results for neurologically intact individuals and an investigation of its validity using inverted face stimuli and prosopagnosic participants. Neuropsychologia 44, 576–585 (2006).
Tardif, J., Watson, M., Giaschi, D. & Gosselin, F. Measuring the contrast sensitivity function in just three clicks. J. Vis. 16, 966–966 (2016).
Arora, S., Liang, Y. & Ma, T. A simple but tough-to-beat baseline for sentence embeddings. https://openreview.net/pdf?id=SyK00v5xx (2017).
Kriegeskorte, N. & Mur, M. Inverse MDS: inferring dissimilarity structure from multiple item arrangements. Front. Psychol. 3, 245 (2012).
Kay, K., Jamison, K. W., Zhang, R.-Y. & Uğurbil, K. A temporal decomposition method for identifying venous effects in task-based fMRI. Nat. Methods 17, 1033–1039 (2020).
Avants, B. B. et al. A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage 54, 2033–2044 (2011).
Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 31, 1116–1128 (2006).
Esteban, O. et al. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods 16, 111–116 (2019).
Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L. & Petersen, S. E. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage 59, 2142–2154 (2012).
Handwerker, D. A., Gonzalez-Castillo, J., D’Esposito, M. & Bandettini, P. A. The continuing challenge of understanding and modeling hemodynamic variation in fMRI. Neuroimage 62, 1017–1023 (2012).
Hoerl, A. E. & Kennard, R. W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12, 55–67 (1970).
Kay, K. N., Winawer, J., Mezer, A. & Wandell, B. Compressive spatial summation in human visual cortex. J. Neurophysiol. 110, 481–494 (2013).
Lage-Castellanos, A., Valente, G., Formisano, E. & De Martino, F. Methods for computing the maximum performance of computational models of fMRI responses. PLoS Comput. Biol. 15, e1006397 (2019).
Biswal, B., Yetkin, F. Z., Haughton, V. M. & Hyde, J. S. Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magn. Reson. Med. 34, 537–541 (1995).
Nili, H. et al. A toolbox for representational similarity analysis. PLoS Comput. Biol. 10, e1003553 (2014).
Kriegeskorte, N., Mur, M. & Bandettini, P. Representational similarity analysis—connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2, 4 (2008).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Gorgolewski, K. J. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci. Data 3, 1–9 (2016).
Cichy, R. M., Roig, G. & Oliva, A. The Algonauts Project. Nat. Mach. Intell. 1, 613 (2019).
We thank the NSD participants for their time and endurance; E. Aminoff, J. Pyles, M. Tarr, M. Hebart and C. Baker for advice on experimental design and data collection; J. Power and A. Schapiro for consultation on resting-state and physiological data; V. Carr and R. Olsen for consultation on hippocampal subfield scanning protocols; A. Grant for assistance with scanner peripherals; F. Gosselin and J. Tardif for contrast sensitivity analysis; B. Klimes-Dougan and K. Cullen for designing the valence/arousal assessment; W. Guo for segmentations of the medial temporal lobe; M. Arcaro, A. Bratch, D. Finzi, A. White and J. Winawer for assistance with ROI definition; C. Gorgolewski and R. Poldrack for discussion of BIDS and data sharing; R. Cichy, E. Yacoub, K. Grill-Spector, K. Jamison, A. Rokem, A. Huth, S. Anzellotti, N. Kriegeskorte and J. Winawer for general discussions; and K. Ugurbil for overall project advice. We also thank our NSD collaborators for shaping the trajectory of the project. This work was supported by NSF CRCNS grants IIS-1822683 (K.K.) and IIS-1822929 (T.N.); NIH grants P41 EB015894, P30 NS076408, S10 RR026783 and S10 OD017974-01, the W. M. Keck Foundation and the NIMH Intramural Research Program ZIAMH002909 (M.N.); and NSF BCS-1734853, NIH NIBIB R01EB030896, NIH NIBIB R01EB029272 and NIH IIS-1912270 (F.P.).
The authors declare no competing financial interests.
Peer review information Nature Neuroscience thanks Evan Gordon, Andrew Zalesky, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, Image presentations. Each of 10,000 distinct images was placed 3 times on a circle according to a probability distribution created by mixing a relatively narrow von Mises distribution and a uniform distribution. The resulting image sequence was divided into 40 equally sized segments for the 40 NSD scan sessions. b, Basic statistics of image repetitions. We define a novel trial as a trial involving an image never shown before, an old trial as a trial that is not a novel trial, and an easy trial as an old trial for which the presented image had been shown previously in the same scan session.
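The placement scheme in a can be sketched as follows; the mixture weight, the concentration (kappa), and the use of a random per-image anchor are illustrative assumptions, not the exact NSD design parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_offsets(n, kappa=25.0, mix=0.5):
    """Angular offsets drawn from a mixture of a narrow von Mises
    distribution and a uniform distribution on the circle. The mixture
    weight and concentration here are illustrative, not the NSD values."""
    use_vm = rng.random(n) < mix
    return np.where(use_vm,
                    rng.vonmises(0.0, kappa, n),       # clustered component
                    rng.uniform(-np.pi, np.pi, n))     # diffuse component

# Place each of 10,000 images 3 times: offsets are applied to a random
# per-image anchor, so repetitions of an image tend to cluster on the
# circle, with occasional widely separated repetitions coming from the
# uniform component of the mixture.
n_images, n_reps = 10000, 3
anchors = rng.uniform(-np.pi, np.pi, n_images)
offsets = sample_offsets(n_images * n_reps).reshape(n_images, n_reps)
positions = (anchors[:, None] + offsets) % (2 * np.pi)
```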
This table summarizes the overall NSD data collection effort. Structural and diffusion MRI data were collected at 3T. Functional MRI data were collected at 7T. The breakdown of the 7T fMRI scan sessions is indicated: for example, subject 2 participated in 1 (prffloc) + 40 (nsd01–nsd40) + 1 (nsdsynthetic) + 1 (nsdimagery) = 43 7T fMRI scan sessions. Additional behavioral data were acquired outside of the scanner (nsdpostbehavior, nsdmemory, nsdmeadows). Note that scan sessions were occasionally split across multiple magnet entries (see aquamarine and yellow cells). For simplicity, we treat these cases as if they represent single scan sessions.
Analyses conducted in this paper can be divided into three parts. Part 1 consists of pre-processing, in which raw functional, anatomical, diffusion, and eyetracking data are transformed into various useful intermediate outcomes. In addition, coordinate transformations between various spaces are estimated and incorporated into the nsd_mapdata utility. Part 2 consists of analyses of the pre-processed fMRI data. The GLMsingle algorithm introduced in this paper is used to analyze the fMRI data from the NSD experiment (Part 2a), and standard methods are used to analyze the fMRI data from the pRF and fLoc experiments (Part 2b). Part 3 consists of specific scientific analyses demonstrated in this paper that make use of the data prepared in Parts 1 and 2. Given the extensive data preparation procedures (Parts 1–2), it is useful to comment on which aspects are fairly typical in MRI processing and which are more customized or unique to the present work. With respect to the pre-processing steps in Part 1, the general outcomes that these steps achieve are typical in MRI and are necessary for basic interpretation of the data. For example, small shifts in head position over the course of a scan session necessitate some motion compensation in order to interpret the signal from a given voxel in terms of a single brain location. The specific methods by which we execute these pre-processing steps may differ from what is performed in commonly used software packages (for example, SPM, FSL, AFNI). However, the outcomes are similar at a conceptual level: for example, the fMRI data are pre-processed using temporal interpolation of voxel-wise time-series data and spatial interpolation of brain volumes. With respect to the additional preparation procedures in Part 2, the procedures in Part 2b are fairly typical analyses used to functionally localize brain regions. 
More customized and unique to the present work are the procedures in Part 2a, which are designed to improve the accuracy of single-trial fMRI amplitude estimates. We provide evidence that these procedures do in fact perform as intended (see Fig. 3 and Extended Data Fig. 8).
a, Pre-processing of eyetracking data. Blinks and tracking noise were removed, followed by linear detrending, median-centering, downsampling, and smoothing. Runs with less than 1/3 valid samples after these cleaning procedures were excluded from further analysis (see Supplementary Note 5). Shown are results for an example run (subject 1, nsd31 scan session, run 6). Pre-processing reduced noise without obscuring potential eye movements. b, Fraction of time during which deviation from central fixation was less than a specific threshold. Results are shown for a range of thresholds (left) and for a threshold of 1° (right). c, 2D histograms of gaze positions. The main images show histogram results on a linear scale; the inset images show results on a log scale. To summarize the results, we overlay a gray ellipse marking the central 90% of a multivariate 2D Gaussian distribution that has been fit to the gaze positions, as well as a blue circle containing 90% of the gaze positions. Both the parametric and non-parametric approaches yield similar results and indicate that gaze positions of all subjects clustered around central fixation. The level of precision varied across subjects. The number of usable eyetracking runs for each subject is indicated by the white text. d, Example of accurate fixation behavior (subject 1, nsd31 scan session, run 8). Shown are pre-processed vertical gaze coordinates (top left), normalized pupil area (bottom left), and a 2D scatter plot of gaze positions (right). e, Example of eye movements (subject 5, nsd29 scan session, run 11). Same format as d. Notice that eye movements manifest as staircase structure in the vertical gaze coordinates and as dispersed gaze positions in the scatter plot. f, Trial-wise time-resolved analysis. Relative to stimulus trial onsets, we plot the across-trial median deviation from central fixation (top), as well as the across-trial median pupil size after mean-centering the pupil size within each trial (bottom). 
Results for subjects 3 and 8 are not available for this analysis. Overall, the results show that subjects were able to maintain fixation most of the time: gaze positions were within 1° of central fixation 68–97% of the time (see b). Three subjects are worth further discussion. Subject 4 exhibited eye movements after stimulus onset (see f, top); however, this is of minor concern given that these movements were small. Subject 5 exhibited more substantial eye movements (see c, e, and f); we suggest exclusion of this subject from analyses of the NSD fMRI data that are contingent on strict central fixation. Finally, while our results indicate fixation instability for subject 8 (see b and c), careful inspection of the eyetracking video recordings (available online) suggests this reflects pupil tracking noise rather than actual eye movements made by the subject.
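The fixation summary in b amounts to thresholding the Euclidean deviation of cleaned gaze samples from central fixation; a minimal sketch on synthetic data (the 1° threshold matches the caption, everything else is illustrative):

```python
import numpy as np

def fixation_fraction(gaze_x, gaze_y, threshold_deg=1.0):
    """Fraction of valid samples whose Euclidean deviation from central
    fixation (0, 0) is below threshold_deg. Coordinates are in degrees of
    visual angle; NaNs mark samples removed during blink/noise cleaning."""
    dev = np.hypot(gaze_x, gaze_y)
    valid = ~np.isnan(dev)
    return float(np.mean(dev[valid] < threshold_deg))

# synthetic example: tight fixation with occasional blink dropouts
rng = np.random.default_rng(2)
x = rng.normal(0.0, 0.3, 5000)
y = rng.normal(0.0, 0.3, 5000)
x[::100] = np.nan                  # samples dropped during cleaning
frac_within_1deg = fixation_fraction(x, y)
```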
a, Comparison of approaches. For an example coronal slice in subject 1, we compare the non-upsampled 1.8-mm preparation of the data (left), the upsampled 1-mm preparation of the data (right), and a version of the 1.8-mm results that has been post-hoc upsampled to 1-mm resolution to enable direct comparison (middle). Two quantities are shown: mean signal intensity and variance explained by an ON-OFF GLM model. b, Zoomed view of white rectangle marked in a. c, Profile view of blue dotted horizontal line marked in b. Error bars in the bottom plot indicate ± 1 SEM across 40 scan sessions (error bars are small and nearly invisible). d, Timecourse estimates for voxels marked by orange arrowheads at the bottom of c. Each colored trace corresponds to an estimate of the hemodynamic timecourse for a single voxel in one NSD scan session from the upsampled 1-mm data preparation. The beginning of the timecourses (first vertical line) corresponds to the onset of the 3-s image presentation. The results shown in this figure support the idea that the upsampled data preparation preserves fine-scale spatial detail that is lost (blurred away) under a non-upsampled data preparation. While the effects are small, preserving as much detail as possible may be critical for certain neuroscientific questions.
Extended Data Fig. 6 Reliable diffusion derivatives facilitate investigation of white-matter connectivity.
a, Fractional anisotropy (FA). The left shows tractography and FA results for the optic radiation identified in subject 7. The right shows reliability of FA results for 61 white-matter tracts identified using the atlas from Bullock et al.114. For other measures, see Supplementary Fig. 5c–e. b, Structural connectivity. Using 43 visual areas × 2 hemispheres = 86 regions from the HCP-MMP1 atlas109 (left), we construct group-average connectivity matrices indicating the density of fibers connecting pairs of regions (right). c, Quantitative summary. Each dot represents fiber density between a pair of regions (as in b). Dot colors reflect different region pairs but are otherwise arbitrary. Group-average results (main figure) and results for an individual subject (inset) are shown.
A variety of ROIs were defined based on auxiliary fMRI experiments (pRF, fLoc). In a–c, we show example results for subject 3, right hemisphere. a, Early visual areas. Results are shown on FreeSurfer’s sphere surface as well as in the 0.8-mm anatomical volume space. b, Eccentricity-based regions. Similar format to a. Note that the total stimulus extent is 8.4° × 8.4° in the pRF, fLoc, and NSD experiments. c, Face-selective regions. Regions were defined based on t-values computed for the contrast of faces against all other categories. Results are shown on FreeSurfer’s inflated surface as well as in the 0.8-mm anatomical space. d, Probabilistic maps of ROI locations. For each of three example ROIs, we map the location of the ROI in each subject to fsaverage and then compute, for each fsaverage vertex, the fraction of subjects labeled at that vertex. Notice there is reasonable consistency across subjects in fsaverage space.
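The probabilistic maps in d reduce to a per-vertex average of subject label indicators once every subject's ROI has been mapped to fsaverage; a minimal sketch with toy labels:

```python
import numpy as np

def probabilistic_roi_map(labels):
    """Per-vertex fraction of subjects labeled. labels: boolean array of
    shape (subjects, vertices), with each subject's ROI already mapped to
    a common fsaverage surface."""
    return np.asarray(labels, dtype=float).mean(axis=0)

# toy example: 8 subjects, 10 vertices; the ROI core is shared by all
labels = np.zeros((8, 10), dtype=bool)
labels[:, 3:7] = True        # extent labeled in every subject
labels[0, 7] = True          # one subject's ROI extends one vertex further
prob_map = probabilistic_roi_map(labels)
```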
We prepared three beta versions (b1, b2, b3) reflecting GLM analyses of increasing sophistication. a, Inspection of NSD betas. The full set of estimated single-trial responses (1.8-mm preparation, beta version b1) is shown for voxels in subject 1 right hemisphere region of interest (ROI) FFA-1 (fusiform face area subdivision 1). We observe horizontal stripes, indicative of gross variation in percent BOLD signal change across voxels. b, Zoomed view of one scan session. Shown are all three beta versions, as well as the result of z-scoring betas within each scan session (in general, we suggest that users may wish to z-score each voxel’s responses within each scan session in order to eliminate potential non-stationarities and to equalize units across voxels). The different beta versions generally resemble one another (left column), implying that the variations in GLM methods do not drastically change the data. Vertical stripes visible in the visualizations tend to decrease from b1 to b2, suggesting that fitting voxel-wise HRFs reduces artifacts. Vertical stripes also tend to decrease from b2 to b3, which might reflect the reduction of correlated noise achieved by GLMdenoise. c, Detailed inspection of one voxel. To assess the reliability of evoked responses, we group trials according to the image presented. The estimated signal standard deviation (σsignal) and noise standard deviation (σnoise) are illustrated at the right of each subplot. Notice that b2 and b3 reduce variability of betas across the 3 trials associated with each image. d, Response reliability. Here we plot single-trial responses observed in two example ROIs (1.8-mm preparation, beta version b2, right hemisphere FFA-1 and PPA (parahippocampal place area), response averaged across voxels in each ROI), showing the first 50 of the shared515 images. The left column shows responses for different trials in subject 1; the right column shows trial-averaged responses in different subjects. 
Lines connecting consecutive images are used to aid visualization but do not indicate specific temporal relationships between images. Thick black lines indicate the mean across trials (left) or subjects (right). Notice that reliability is reasonably high both within and across subjects. e, Quantitative summary. To summarize results shown in d, we plot the correlation between responses to the shared515 images across all trials and all subjects. Thin white horizontal and vertical lines separate different subjects (each having 3 trials). Notice there is high reliability within each ROI, and responses are highly dissimilar across ROIs. The strong off-diagonal elements (white arrows) indicate the presence of spatial noise correlations that occur on individual trials, which is typical in fMRI45. Noise correlations likely reflect a combination of measurement noise (for example, head motion) and real neural activity variability (for example, arousal effects). In some cases, correlations are larger across subjects than within subjects; one explanation is that there is, to some degree, a common ROI representation and a noisy measurement of this representation obtained in one subject might actually be better correlated with a less noisy measurement of this representation obtained in a different subject. Also, the results indicate the existence of temporal ordering effects (for example, trial 1 in a given subject tends to be more correlated with trial 1 in other subjects as opposed to trials 2 or 3). This likely indicates the presence of adaptation- and/or memory-related effects in the NSD data, given that the temporal ordering of trials was fixed across subjects.
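Two operations described in this caption, session-wise z-scoring of betas and partitioning response variance into σsignal and σnoise from repeated trials, can be sketched as follows. The variance-partitioning estimator shown is one standard scheme and is not necessarily the exact estimator used for the figure:

```python
import numpy as np

def zscore_within_session(betas, session_ids):
    """Z-score each voxel's single-trial betas within each scan session.
    betas: (voxels, trials); session_ids: (trials,)."""
    out = np.empty_like(betas, dtype=float)
    for s in np.unique(session_ids):
        m = session_ids == s
        mu = betas[:, m].mean(axis=1, keepdims=True)
        sd = betas[:, m].std(axis=1, keepdims=True)
        out[:, m] = (betas[:, m] - mu) / sd
    return out

def signal_noise_std(betas_by_image):
    """Estimate sigma_signal and sigma_noise for one voxel from repeated
    trials. betas_by_image: (images, repeats). Noise variance is the mean
    within-image variance across repeats; signal variance is the
    across-image variance of the repeat means, corrected for the noise
    they still contain."""
    n_rep = betas_by_image.shape[1]
    var_noise = betas_by_image.var(axis=1, ddof=1).mean()
    var_means = betas_by_image.mean(axis=1).var(ddof=1)
    var_signal = max(var_means - var_noise / n_rep, 0.0)
    return np.sqrt(var_signal), np.sqrt(var_noise)

rng = np.random.default_rng(3)

# simulate one voxel: per-image signal sd = 2, trial noise sd = 1, 3 repeats
reps = rng.normal(0.0, 2.0, 1000)[:, None] + rng.normal(0.0, 1.0, (1000, 3))
sigma_signal, sigma_noise = signal_noise_std(reps)

# simulate session-wise offsets across 4 sessions and remove them
session_ids = np.repeat(np.arange(4), 750)
betas = rng.normal(5.0, 3.0, (10, 3000)) + np.linspace(0.0, 2.0, 4).repeat(750)
zb = zscore_within_session(betas, session_ids)
```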
Here we show results from the analysis of the pRF experiment and results from an analogous analysis performed on trial-averaged NSD betas (see Supplementary Modeling Note 1 for details). Each panel shows an occipital view of FreeSurfer’s sphere surface, and white lines indicate borders of visual areas V1–hV4 (defined based on results of the pRF experiment). Angle and eccentricity estimates are plotted using the same colormaps as in Benson et al.30 We also plot the amount of time-series variance explained in the pRF data (variance relative to the mean signal level) and the amount of variance explained in the NSD betas (variance relative to 0% BOLD signal change). Clear retinotopic maps in early visual cortex are visible in the NSD results, including robust angle estimates even in foveal regions. In addition, there is high consistency of retinotopic estimates across the pRF and NSD datasets. There is some discrepancy in absolute eccentricity estimates at peripheral locations; this is likely due to technical differences in how modeling procedures behave for voxels near the stimulus edge.
a, Illustration of an encoding model that predicts brain activity in a given voxel (rtv) in response to images (xt). Images are passed to nonlinear feature extractors, ηl (trapezoids), that output feature maps (grey cuboids). Feature maps are grouped, passed through an element-wise nonlinearity, f(·), and then multiplied pixel-wise by a spatial pooling field (g1,…,gN where superscripts index distinct groups of feature maps) that determines the region of visual space that drives voxel activity. The weighted pixel values in each feature map are then summed, reducing each feature map to a scalar value. These scalar values are concatenated across all feature maps, forming a single feature vector that is passed through another element-wise nonlinearity (left black rectangle) and then weighted by a set of feature weights, w (right black rectangle), to yield predicted voxel activity. Note that for each type of encoding model (for example, AlexNet-based encoding model, GNet-based encoding model), the feature extractors are identical for all voxels, but the spatial pooling fields and feature weights are optimized and may vary across voxels. For the AlexNet-based encoding model, the feature extractors were pre-specified, the spatial pooling fields were optimized via line search, and the feature weights w were optimized via ridge regression. For the GNet-based encoding model, stochastic gradient descent with early stopping was used to optimize the parameters of the feature extractors ηl, the spatial pooling fields g1,…,gN, and the feature weights w. b, Illustration of spatial pooling fields. For the AlexNet model, a single isotropic 2D Gaussian pooling field (middle) selected from a set of candidates (right) was applied to all feature maps. For the GNet model, an independent, flexible pooling field (left) was applied to each group of feature maps. 
Applying flexible pooling fields to AlexNet leads to lower prediction accuracy overall, so we present the version that uses isotropic 2D Gaussian fields. c, Comparative architecture of AlexNet and GNet. AlexNet and GNet are both deep convolutional neural networks, but differ in the types and sequencing of layers (rows of the table). The first three layers are the same for both networks and correspond to the first three layers of an AlexNet trained to classify objects in the ImageNet dataset. For both networks, these shared ‘pre-filtering’ layers are followed by sequences of convolutional layers (rows labeled ‘conv’; values indicate feature depth and convolutional filter resolution; ‘str’ = filter stride, ‘pad’ = convolutional padding), max-pooling layers (‘maxpool’), batch-normalization and weight-dropout layers (‘batchnorm + dropout’), adaptive averaging layers (‘adaptive avg’), and fully-connected layers (‘fully con.’; value indicates number of units). Feature maps in the convolutional or fully connected layers (indicated by red arrows; resolution of the feature maps in parentheses) are used as predictors of brain activity in the context of an encoding model (see a).
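The forward pass in a can be sketched as follows; the shapes, nonlinearities, and Gaussian pooling-field parameterization are illustrative (panel b notes that the AlexNet variant uses a single isotropic 2D Gaussian field shared across feature maps):

```python
import numpy as np

def gaussian_pooling_field(size, mu_row, mu_col, sigma):
    """Isotropic 2D Gaussian pooling field (the AlexNet variant in panel b),
    normalized to sum to 1."""
    r, c = np.mgrid[0:size, 0:size]
    g = np.exp(-((r - mu_row) ** 2 + (c - mu_col) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def predict_voxel(feature_map_groups, pooling_fields, w,
                  f=np.abs, f2=lambda v: v):
    """One voxel's predicted activity. Each group l of feature maps
    (depth_l, H, W) is passed through the element-wise nonlinearity f,
    weighted pixel-wise by its pooling field g^l, and summed over space,
    reducing each feature map to a scalar. The concatenated scalars pass
    through a second nonlinearity f2 and are weighted by w."""
    pooled = [(f(maps) * g[None]).sum(axis=(1, 2))
              for maps, g in zip(feature_map_groups, pooling_fields)]
    features = f2(np.concatenate(pooled))
    return float(features @ w)

rng = np.random.default_rng(4)
maps = [rng.standard_normal((8, 16, 16)),    # group 1: depth 8
        rng.standard_normal((4, 16, 16))]    # group 2: depth 4
g = gaussian_pooling_field(16, 8.0, 8.0, 3.0)
w = rng.standard_normal(12)                  # one weight per feature map
pred = predict_voxel(maps, [g, g], w)
```

In the GNet variant, the pooling fields would be free per-group parameter arrays optimized jointly with the feature extractors, rather than a Gaussian selected from a candidate grid.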
Allen, E.J., St-Yves, G., Wu, Y. et al. A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence. Nat Neurosci 25, 116–126 (2022). https://doi.org/10.1038/s41593-021-00962-x