Abstract
The meaning of language is represented in regions of the cerebral cortex collectively known as the ‘semantic system’. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. Here we systematically map semantic selectivity across the cortex using voxel-wise modelling of functional MRI (fMRI) data collected while subjects listened to hours of narrative stories. We show that the semantic system is organized into intricate patterns that seem to be consistent across individuals. We then use a novel generative model to create a detailed semantic atlas. Our results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas shows which domains are represented in each area. This study demonstrates that data-driven methods—commonplace in studies of human neuroanatomy and functional connectivity—provide a powerful and efficient means for mapping functional representations in the brain.
Acknowledgements
This work was supported by grants from the National Science Foundation (NSF; IIS1208203), the National Eye Institute (EY019684), and from the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370. A.G.H. was also supported by the William Orr Dingwall Neurolinguistics Fellowship. We thank J. Sohl-Dickstein and K. Crane for technical discussions about PrAGMATiC, J. Nguyen for assistance transcribing and aligning stimuli, B. Griffin for segmenting and flattening cortical surfaces, and N. Bilenko, J. Gao, M. Lescroart and A. Nunez-Elizalde for general comments and discussions.
Author information
Contributions
All authors helped conceive and design the experiment. W.A.d.H. and A.G.H. selected and annotated stimuli and collected fMRI data. A.G.H. analysed the data. A.G.H. and T.L.G. designed the PrAGMATiC generative model. A.G.H. and J.L.G. wrote the paper. J.L.G. contributed to all aspects of the project.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Extended data figures and tables
Extended Data Figure 1 Voxel-wise model prediction performance.
Cortical flatmaps showing prediction performance of voxel-wise semantic models for all seven subjects, formatted similarly to Fig. 1c. Models were tested using one 10-min story that was not included during model estimation. Prediction performance was then computed as the correlation between predicted and measured BOLD responses. Left column, raw prediction performance. Note that the colourmap here is scaled 0–1 rather than 0–0.6 as in Fig. 1c to match the scale of the adjusted prediction performance maps. Right column, prediction performance corrected to account for different amounts of noise in the BOLD responses (see Supplementary Methods for details). The voxel-wise semantic models predict BOLD responses in many brain areas, including SPFC, IPFC, LTC, VTC, LPC and MPC. These same regions have been previously identified as the semantic system in the human brain.
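The raw prediction performance described above, a correlation between predicted and measured responses for each voxel, can be sketched as follows. This is a minimal illustration using NumPy on synthetic data; the function name `voxelwise_correlation` and the array layout (time points by voxels) are assumptions, and the noise correction shown in the right column is not reproduced here.

```python
import numpy as np

def voxelwise_correlation(predicted, measured):
    """Pearson correlation per voxel between two (time, voxels) arrays."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    # z-score each voxel's time course (population standard deviation)
    pz = (predicted - predicted.mean(axis=0)) / predicted.std(axis=0)
    mz = (measured - measured.mean(axis=0)) / measured.std(axis=0)
    # The mean product of z-scores is the Pearson correlation
    return (pz * mz).mean(axis=0)

# Synthetic stand-in for a held-out test story: 300 time points, 5 voxels
rng = np.random.default_rng(0)
measured = rng.standard_normal((300, 5))
predicted = measured + rng.standard_normal((300, 5))  # noisy predictions
r = voxelwise_correlation(predicted, measured)
```

Each entry of `r` is the prediction performance for one voxel, which is the quantity that would be painted onto a flatmap.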
Extended Data Figure 2 Amount of variance explained by individual subject and group semantic dimensions.
Principal components analysis was used to discover the most important semantic dimensions from voxel-wise semantic model weights in each subject. To reduce noise, we used only the 10,000 best voxels in each subject, determined by cross-validation within the model estimation data set. Here we show the amount of variance explained in the semantic model weights by each of the 20 most important principal components (PCs). Orange lines show the amount of variance explained by each subject’s own PCs, blue lines show the variance explained by the PCs of combined data from the other six subjects, and grey lines show the variance explained by the PCs of the stories. (The Gale–Shapley stable marriage algorithm was used to re-order the group and stimulus PCs to maximize their correlation with the subject’s PCs.) Error bars indicate 99% confidence intervals. Confidence intervals for the subjects’ own PCs and group PCs are very small. Hollow markers indicate subject or group PCs that explain significantly more variance than the corresponding stimulus PCs (P < 0.001, bootstrap test). Six PCs explain significantly more variance in one out of seven subjects, five PCs in two subjects, four PCs in three subjects, and three PCs in one subject. Thus, four PCs seem to comprise a semantic space that is common across most individuals.
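The comparison between a subject's own PCs and PCs derived from other data can be illustrated with a short sketch. It assumes model weights stored as a (voxels, features) array and uses synthetic data; the helper names are hypothetical, and the voxel selection, Gale–Shapley re-ordering and bootstrap test described in the caption are omitted.

```python
import numpy as np

def principal_axes(weights, n_components):
    """Top principal axes (rows) of a (voxels, features) matrix via SVD."""
    centered = weights - weights.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]

def variance_explained(weights, axes):
    """Fraction of total variance in `weights` captured along each unit-norm axis."""
    centered = weights - weights.mean(axis=0)
    projected = centered @ axes.T  # shape (voxels, n_axes)
    return (projected ** 2).sum(axis=0) / (centered ** 2).sum()

# Synthetic weights for one "subject" and a pooled "group" (assumed shapes)
rng = np.random.default_rng(0)
mixing = rng.standard_normal((10, 10))
subject = rng.standard_normal((200, 10)) @ mixing
group = rng.standard_normal((1200, 10)) @ mixing

own = variance_explained(subject, principal_axes(subject, 4))
from_group = variance_explained(subject, principal_axes(group, 4))
```

By construction, a subject's own top PCs capture at least as much cumulative variance in that subject's weights as any other set of orthonormal axes, so the informative comparison is how close the group PCs come.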
Extended Data Figure 3 Separate cortical projections of semantic dimensions 1–4 on subject S2 and combined cortical projections of dimensions 1–3 for subjects S1, S3 and S4.
a, Voxel-wise semantic model weights for subject S2 were projected onto each of the common semantic dimensions defined by PCs 1–4. Voxels for which model generalization performance was not significantly greater than zero (q(FDR) > 0.05) are shown in grey. Positive projections are shown in red, negative projections in blue and near-zero projections in white. Voxels with fMRI signal dropout due to field inhomogeneity are shaded with black hatched lines. b, Like Fig. 2b, c, this panel shows the result of projecting voxel-wise models onto the first three common semantic dimensions, and then colouring each voxel using an RGB colourmap. The red colour component corresponds to the projection on the first PC, the green component to the second, and the blue component to the third. Semantic information seems to be represented in complex patterns distributed across the semantic system and the patterns seem to be largely conserved across individuals.
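The RGB colouring can be sketched as a projection followed by a rescaling into the unit interval. The caption does not specify how projections are scaled, so the symmetric per-channel scaling below (zero projection maps to mid-grey) is an assumption, as are the function name `project_to_rgb` and the synthetic inputs.

```python
import numpy as np

def project_to_rgb(weights, axes):
    """Project (voxels, features) weights onto three axes and rescale to [0, 1]."""
    proj = weights @ axes[:3].T  # columns: PC1 -> red, PC2 -> green, PC3 -> blue
    # Assumed scaling: symmetric per channel, so a zero projection maps to 0.5
    scale = np.abs(proj).max(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for an all-zero channel
    return 0.5 + 0.5 * proj / scale

# Synthetic example: 50 voxels, 8 features, 3 orthonormal axes
rng = np.random.default_rng(0)
weights = rng.standard_normal((50, 8))
axes = np.linalg.qr(rng.standard_normal((8, 3)))[0].T  # shape (3, 8)
rgb = project_to_rgb(weights, axes)
```

Each row of `rgb` is a colour for one voxel; voxels with similar projections on the first three dimensions receive similar colours.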
Extended Data Figure 4 PrAGMATiC atlas likelihood maps.
Comparison of actual semantic maps (Fig. 2, Extended Data Fig. 3) to the maps generated from the PrAGMATiC atlas (Fig. 3). PrAGMATiC atlases for the left and right hemispheres were fit using data from all seven subjects. The left hemisphere atlas has 192 total areas and the right hemisphere has 128 (including non-semantic areas). Here we show the actual semantic maps for four subjects (first column), the PrAGMATiC atlas on each subject’s cortical surface (second column), the log likelihood ratio of the actual semantic map under the PrAGMATiC atlas versus a null model (third column), and the fraction of variance in the semantic map that the PrAGMATiC atlas explains for each location on the cortical surface (fourth column). The likelihood ratio maps show that most areas where there are large semantic model weights (that is, the semantic system) are much better explained by PrAGMATiC than by a null model and thus appear red, while areas where the weights are small (that is, somatomotor cortex, visual cortex, and so on) are about equally well explained by both PrAGMATiC and the null model and thus appear white. Variance explained was computed by subtracting the PrAGMATiC atlas from the actual semantic map (in the space of the four group semantic dimensions), squaring and summing the residuals and then dividing by the sum of squares in the actual map. The variance explained maps show that the PrAGMATiC atlas captures a large fraction of the variance in the semantic maps (37–47% in total).
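The variance-explained computation can be sketched in a few lines. The caption describes the ratio of the residual sum of squares to the sum of squares in the actual map; the sketch assumes that the reported fraction explained is one minus that ratio, which is a reading of the caption rather than something it states. Array shapes and names are hypothetical.

```python
import numpy as np

def fraction_variance_explained(actual_map, atlas_map, axis=None):
    """1 - (sum of squared residuals) / (sum of squares of the actual map).

    Both maps are (locations, dimensions) arrays in the space of the group
    semantic dimensions. Use axis=1 for one value per cortical location,
    or axis=None for a single total.
    """
    residual = actual_map - atlas_map
    unexplained = (residual ** 2).sum(axis=axis) / (actual_map ** 2).sum(axis=axis)
    return 1.0 - unexplained

# Toy maps: two locations, four semantic dimensions
actual = np.array([[1.0, 2.0, 0.0, 0.0],
                   [0.0, 1.0, 1.0, 0.0]])
atlas = 0.5 * actual  # an atlas that recovers half of each value
total = fraction_variance_explained(actual, atlas)
per_location = fraction_variance_explained(actual, atlas, axis=1)
```

With this toy atlas the residual is half the actual map, so a quarter of the sum of squares remains unexplained.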
Extended Data Figure 5 Comparison of PrAGMATiC models fit with different initial conditions.
As with many clustering algorithms, PrAGMATiC optimizes a non-convex objective function and so can find many potential locally optimal solutions. To reduce the effect of non-convexity on our results, we re-fit the model ten times (each time with a different random initialization), and then selected the model fit that yielded the best likelihood (that is, performance on the training set) as the PrAGMATiC atlas (Fig. 3). Here we show the PrAGMATiC atlas (top) and the second best model out of the ten that were estimated (bottom). The parcellations given by these two models are very similar. However, there are a few differences, which illustrate uncertainty in the model. Some of these differences are due to statistical thresholding: a few areas that were found to be significantly semantically selective in the best model are missing in the alternative model (see left medial prefrontal cortex), and some significant areas in the alternate model are missing from the best model (left ventral occipital cortex). Other differences suggest alternative parcellations for a few regions, where, for example, the same region of cortex is parcellated into three areas in the best model and four areas in the alternative model. Yet it is clear that none of the differences between these two models are sufficient to change any of the interpretations given in the main text.
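The restart strategy described above is generic and can be illustrated with any non-convex clustering objective. The sketch below uses a small k-means implementation as a stand-in; it is not PrAGMATiC, and the function names, data and use of within-cluster cost in place of likelihood are all assumptions made for illustration.

```python
import numpy as np

def kmeans_once(data, k, rng, n_iter=50):
    """One k-means run from a random initialization; returns (centers, cost)."""
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.min(axis=1).sum()

def best_of_restarts(data, k, n_restarts=10, seed=0):
    """Fit from several random initializations; return runs sorted best first."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(data, k, rng) for _ in range(n_restarts)]
    runs.sort(key=lambda run: run[1])  # lower cost plays the role of higher likelihood
    return runs

# Three synthetic blobs in two dimensions
data_rng = np.random.default_rng(1)
data = np.concatenate([data_rng.normal(loc, 0.3, size=(40, 2)) for loc in (-3, 0, 3)])
runs = best_of_restarts(data, k=3)
best_centers, best_cost = runs[0]
second_centers, second_cost = runs[1]
```

Comparing `runs[0]` with `runs[1]` mirrors the comparison in this figure: if the best and second-best fits agree closely, the solution is unlikely to be an artefact of one particular initialization.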
Extended Data Figure 6 Semantic atlas for the LPC.
The PrAGMATiC atlas divides the LPC into 15 areas in the left hemisphere and 13 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely this 12-category interpretation captures the average semantic model in each area. The LPC appears to be organized around the angular gyrus (AG), with a core that is selective for social, emotional and mental concepts (L6, 7, 9, 11; R5, 7) and a periphery that is selective for visual, tactile and numeric concepts (L2, 4, 5, 8, 10, 15; R6, 11).
Extended Data Figure 7 Semantic atlas for the MPC.
The PrAGMATiC atlas divides the MPC into 14 areas in the left hemisphere and 10 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the MPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Like the LPC, the MPC appears to be organized around a core group of areas that are selective for social and mental concepts (L6, 8, 10; R6, 7). Dorsolateral MPC areas (L2, 4; R1) are selective for visual and tactile concepts. Anterior dorsal areas (L5, 9; R4, 9) are selective for temporal concepts. Ventral areas (L11, 12, 14; R8) are selective for professional, temporal and locational concepts. Just above the retrosplenial cortex one distinct area in each hemisphere is selective for mental, professional and temporal concepts (L7; R3). Overall, the right MPC responds more than the left MPC to mental concepts.
Extended Data Figure 8 Semantic atlas for the SPFC.
The PrAGMATiC atlas divides the SPFC into 18 areas in the left hemisphere and 19 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the SPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The organization in the SPFC seems to follow the long rostro-caudal sulci and gyri of the dorsal frontal lobe. Posterior–lateral SPFC areas (L4, 6; R6, 9, 11) are selective for social, emotional, communal and violent concepts. Posterior superior frontal sulcus areas (L2, 3, 7, 8; R1, 5, 7) are selective for visual, tactile and numeric concepts. The superior frontal gyrus contains a long strip of areas (L1, 5, 10, 12–15; R8, 12, 14–16) selective for social, emotional, communal and violent concepts.
Extended Data Figure 9 Semantic atlas for the LTC.
The PrAGMATiC atlas divides the LTC into eight areas in both the left and right hemispheres. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Anterior LTC areas (L4–8; R3–8) are selective for social, emotional, mental and violent concepts. Posterior LTC areas (L1–3; R1–2) are selective for numeric, tactile and visual concepts.
Extended Data Figure 10 Semantic atlas for the VTC.
The PrAGMATiC atlas divides the VTC into six areas in the left hemisphere and one area in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the VTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The VTC is relatively homogeneous: all areas are selective for numeric, tactile and visual concepts. Left VTC areas close to the parahippocampal place area (PPA) are also selective for locational concepts (L5–6).
Extended Data Figure 11 Semantic atlas for the IPFC.
The PrAGMATiC atlas divides the IPFC into 12 areas in the left hemisphere and 9 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the IPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Posterior IPFC areas in the precentral sulcus (L1–3; R1, 2) are selective for visual, tactile and numeric concepts. Areas on the inferior frontal gyrus (L8; R4, 7) are selective for social and violent concepts. Areas in the inferior frontal sulcus and anterior middle frontal gyrus (L4–7; R5–6) are selective for visual, tactile and numeric concepts. Areas in the orbitofrontal sulci (L10; R9) are also selective for visual, tactile, numeric and locational concepts.
Extended Data Figure 12 Semantic atlas for the opercular and insular cortex.
The PrAGMATiC atlas divides the opercular and insular cortex (OIC) into four areas in the left hemisphere and three areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the OIC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. These areas are homogeneously selective for abstract concepts, with more posterior and superior areas also responding to emotional, communal and mental concepts.
Supplementary information
Supplementary Information
This file contains Supplementary Data, Supplementary Methods, Supplementary Tables 1–3 and Supplementary References. (PDF 1330 kb)
About this article
Cite this article
Huth, A., de Heer, W., Griffiths, T. et al. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature 532, 453–458 (2016). https://doi.org/10.1038/nature17637
DOI: https://doi.org/10.1038/nature17637