Natural speech reveals the semantic maps that tile human cerebral cortex

Journal name: Nature
Volume: 532
Pages: 453–458
DOI: doi:10.1038/nature17637

Abstract

The meaning of language is represented in regions of the cerebral cortex collectively known as the ‘semantic system’. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. Here we systematically map semantic selectivity across the cortex using voxel-wise modelling of functional MRI (fMRI) data collected while subjects listened to hours of narrative stories. We show that the semantic system is organized into intricate patterns that seem to be consistent across individuals. We then use a novel generative model to create a detailed semantic atlas. Our results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas shows which domains are represented in each area. This study demonstrates that data-driven methods—commonplace in studies of human neuroanatomy and functional connectivity—provide a powerful and efficient means for mapping functional representations in the brain.

At a glance

Figures

  1. Voxel-wise modelling.
    Figure 1: Voxel-wise modelling.

    a, Seven subjects listened to over 2 h of naturally spoken narrative stories while BOLD responses were measured using fMRI. Each word in the stories was projected into a 985-dimensional word embedding space constructed using word co-occurrence statistics from a large corpus of text. A finite impulse response (FIR) regression model was estimated individually for every voxel. The voxel-wise model weights describe how words appearing in the stories influence BOLD signals. b, Models were tested using one 10-min story that was not included during model estimation. Model prediction performance was computed as the correlation between predicted responses to this story and actual BOLD responses. c, Prediction performance of voxel-wise models for one subject. Semantic models accurately predict BOLD responses in many brain areas, including the LTC, VTC, LPC, MPC, SPFC and IPFC. These regions have previously been identified as the semantic system in the human brain. LH, left hemisphere; RH, right hemisphere.
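
    The finite impulse response design and the correlation-based test described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the delay values, the toy feature matrix and the hand-picked weights are all assumptions, and the regression fit itself is omitted.

```python
from math import sqrt


def make_delayed(features, delays):
    """Stack time-shifted copies of a (time x feature) matrix side by side.

    Row t of the result holds, for each delay d, the features from time
    t - d, or zeros when t - d falls outside the recording.
    """
    n_features = len(features[0])
    delayed = []
    for t in range(len(features)):
        row = []
        for d in delays:
            src = t - d
            if 0 <= src < len(features):
                row.extend(features[src])
            else:
                row.extend([0.0] * n_features)
        delayed.append(row)
    return delayed


def predict(delayed, weights):
    """Linear prediction: one weight per (delay, feature) column."""
    return [sum(w * v for w, v in zip(weights, row)) for row in delayed]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


# Toy example: one feature, five time points, delays of 1 and 2 samples
# (hypothetical values chosen only for illustration).
features = [[1.0], [0.0], [0.0], [2.0], [0.0]]
delayed = make_delayed(features, delays=[1, 2])
predicted = predict(delayed, weights=[0.5, 1.0])
```

    Prediction performance for a voxel would then be `pearson_r(predicted, measured)` on a held-out story, where `measured` stands for that voxel's recorded response.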

  2. Principal components of voxel-wise semantic models.
    Figure 2: Principal components of voxel-wise semantic models.

    a–c, Principal components analysis of voxel-wise model weights reveals four important semantic dimensions in the brain (Extended Data Fig. 2). a, An RGB colourmap was used to colour both words and voxels based on the first three dimensions of the semantic space. Words that best match the four semantic dimensions were found and then collapsed into 12 categories using k-means clustering. Each category (Supplementary Table 2) was manually assigned a label. The 12 category labels (large words) and a selection of the 458 best words (small words) are plotted here along four pairs of semantic dimensions. The largest axis of variation lies roughly along the first dimension, and separates perceptual and physical categories (tactile, locational) from human-related categories (social, emotional, violent). PC, principal component. b, Voxel-wise model weights were projected onto the semantic dimensions and then coloured using the same RGB colourmap (see Extended Data Fig. 3 for separate dimensions). Projections for one subject (S2) are shown on that subject’s cortical surface. Semantic information seems to be represented in intricate patterns across much of the semantic system. c, Semantic principal component flatmaps for three other subjects. Comparing these flatmaps, many patterns appear to be shared across individuals. (See Extended Data Fig. 3 for other subjects.) Abbreviations for regions of interest are listed in the Methods section.
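
    The projection-and-colouring step in panel b might look roughly like the sketch below. The linear scaling, the clipping and the `scale` parameter are assumptions made for illustration; the caption does not specify how projections were mapped to colour values.

```python
def project(weights, components):
    """Project one voxel's weight vector onto each semantic dimension."""
    return [sum(w * c for w, c in zip(weights, comp)) for comp in components]


def to_rgb(projection, scale):
    """Map the first three projections to an (r, g, b) triple in 0..255.

    Assumed mapping: a projection of -scale gives 0, zero gives mid-level,
    and +scale gives 255; values beyond that range are clipped.
    """
    rgb = []
    for value in projection[:3]:
        unit = 0.5 + 0.5 * max(-1.0, min(1.0, value / scale))
        rgb.append(int(unit * 255))
    return tuple(rgb)


# Toy two-dimensional weights and three made-up component vectors.
voxel_projection = project([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
voxel_colour = to_rgb([1.0, -1.0, 0.0], scale=1.0)
```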

  3. PrAGMATiC: a generative model for cortical maps.
    Figure 3: PrAGMATiC: a generative model for cortical maps.

    a–c, To create an atlas that describes the distribution of semantically selective functional areas in the human cerebral cortex we developed PrAGMATiC, a probabilistic and generative model of areas tiling the cortex. a, PrAGMATiC has two parts: an arrangement model and an emission model. The arrangement model is analogous to a physical system of springs joining neighbouring area centroids. To enforce similarity across subjects, springs also join areas to 19 regions of interest that were localized separately. The emission model assigns the functional mean of the closest area centroid to each point on the cortex, forming a Voronoi tessellation. Spring lengths and area means are shared across subjects while exact area locations are unique to each subject. These parameters are fit using maximum-likelihood estimation. b, A leave-one-out procedure was used to choose the number of areas in each hemisphere. PrAGMATiC models were estimated on six subjects and then used to predict BOLD responses for the seventh. Prediction performance improved significantly up to 192 total areas in the left hemisphere and 128 areas in the right. c, A semantic atlas was estimated using data from all seven subjects. Areas for which the semantic model did not predict better than models based on low-level features (that is, word rate, phonemes) were removed. The remaining areas were plotted on one subject’s cortical surface using the same RGB colourmap as Fig. 2. Areas dominated by signal dropout are shown in black hatching, and areas where the low-level models performed well are shown in white hatching. This atlas shows the functional organization of the semantic system that is common across subjects.
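
    The two parts of the model described in panel a can be illustrated with a toy sketch. It assumes flat two-dimensional coordinates with Euclidean distance and a quadratic (Hookean) spring energy; both are simplifications chosen here, and all function names are hypothetical rather than taken from the PrAGMATiC implementation.

```python
from math import dist


def nearest_centroid(point, centroids):
    """Index of the area centroid closest to a point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def emit_map(points, centroids, area_means):
    """Emission step: every point takes the functional mean of its nearest
    centroid, which partitions the surface into Voronoi cells."""
    return [area_means[nearest_centroid(p, centroids)] for p in points]


def spring_energy(centroids, springs):
    """Arrangement step: sum of squared deviations of each spring from its
    rest length. `springs` holds (i, j, rest_length) triples."""
    return sum(
        (dist(centroids[i], centroids[j]) - rest) ** 2
        for i, j, rest in springs
    )


# Two areas on a toy flat surface, with made-up labels as functional means.
centroids = [(1.0, 0.0), (9.0, 0.0)]
tessellation = emit_map([(0.0, 0.0), (10.0, 0.0), (4.0, 1.0)], centroids, ["A", "B"])
```

    In this picture, spring rest lengths and area means would be shared across subjects while each subject keeps its own centroid positions, as the caption describes.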

  4. Voxel-wise model prediction performance.
    Extended Data Fig. 1: Voxel-wise model prediction performance.

    Cortical flatmaps showing prediction performance of voxel-wise semantic models for all seven subjects, formatted similarly to Fig. 1c. Models were tested using one 10-min story that was not included during model estimation. Prediction performance was then computed as the correlation between predicted and measured BOLD responses. Left column, raw prediction performance. Note that the colourmap here is scaled 0–1 rather than 0–0.6 as in Fig. 1c to match the scale of the adjusted prediction performance maps. Right column, prediction performance corrected to account for different amounts of noise in the BOLD responses (see Supplementary Methods for details). The voxel-wise semantic models predict BOLD responses in many brain areas, including SPFC, IPFC, LTC, VTC, LPC and MPC. These same regions have been previously identified as the semantic system in the human brain.

  5. Amount of variance explained by individual subject and group semantic dimensions.
    Extended Data Fig. 2: Amount of variance explained by individual subject and group semantic dimensions.

    Principal components analysis was used to discover the most important semantic dimensions from voxel-wise semantic model weights in each subject. To reduce noise, we used only the 10,000 best voxels in each subject, determined by cross-validation within the model estimation data set. Here we show the amount of variance explained in the semantic model weights by each of the 20 most important principal components (PCs). Orange lines show the amount of variance explained by each subject’s own PCs, blue lines show the variance explained by the PCs of combined data from the other six subjects, and grey lines show the variance explained by the PCs of the stories. (The Gale–Shapley stable marriage algorithm was used to re-order the group and stimulus PCs to maximize their correlation with the subject’s PCs.) Error bars indicate 99% confidence intervals. Confidence intervals for the subjects’ own PCs and group PCs are very small. Hollow markers indicate subject or group PCs that explain significantly more variance than the corresponding stimulus PCs (P < 0.001, bootstrap test). Six PCs explain significantly more variance in one out of seven subjects, five PCs in two subjects, four PCs in three subjects, and three PCs in one subject. Thus, four PCs seem to comprise a semantic space that is common across most individuals.
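
    The re-ordering step can be illustrated with a small Gale–Shapley sketch. It assumes a square matrix of similarity scores (for example, absolute correlations between a subject's PCs and the group PCs) that both sides use as their preference; that symmetric-preference setup and the example scores are assumptions made here, not details from the caption.

```python
def stable_match(score):
    """Gale-Shapley stable matching on a square score matrix.

    score[i][j] is the similarity between proposer i (e.g. subject PC i)
    and receiver j (e.g. group PC j). Returns a list `match` where
    match[i] is the receiver paired with proposer i.
    """
    n = len(score)
    # Each proposer ranks receivers from most to least similar.
    prefs = [sorted(range(n), key=lambda j: -score[i][j]) for i in range(n)]
    next_choice = [0] * n          # next receiver each proposer will try
    engaged_to = [None] * n        # receiver -> current proposer
    free = list(range(n))
    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = engaged_to[j]
        if current is None:
            engaged_to[j] = i
        elif score[i][j] > score[current][j]:
            # Receiver j prefers the new proposer; the old one is freed.
            engaged_to[j] = i
            free.append(current)
        else:
            free.append(i)
    match = [None] * n
    for j, i in enumerate(engaged_to):
        match[i] = j
    return match


# Made-up similarity scores between three subject PCs and three group PCs.
order = stable_match([
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.7, 0.4],
])
```

    Here `order[i]` gives the group PC that would be placed alongside subject PC `i`.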

  6. Separate cortical projections of semantic dimensions 1–4 on subject S2 and combined cortical projections of dimensions 1–3 for subjects S1, S3 and S4.
    Extended Data Fig. 3: Separate cortical projections of semantic dimensions 1–4 on subject S2 and combined cortical projections of dimensions 1–3 for subjects S1, S3 and S4.

    a, Voxel-wise semantic model weights for subject S2 were projected onto each of the common semantic dimensions defined by PCs 1–4. Voxels for which model generalization performance was not significantly greater than zero (q(FDR) > 0.05) are shown in grey. Positive projections are shown in red, negative projections in blue and near-zero projections in white. Voxels with fMRI signal dropout due to field inhomogeneity are shaded with black hatched lines. b, Like Fig. 2b, c, this panel shows the result of projecting voxel-wise models onto the first three common semantic dimensions, and then colouring each voxel using an RGB colourmap. The red colour component corresponds to the projection on the first PC, the green component to the second, and the blue component to the third. Semantic information seems to be represented in complex patterns distributed across the semantic system and the patterns seem to be largely conserved across individuals.
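
    The false discovery rate threshold in panel a can be illustrated with the Benjamini–Hochberg step-up procedure. The caption does not state which FDR procedure was applied, so the choice of Benjamini–Hochberg here is an assumption, and the p-values are invented.

```python
def bh_significant(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns one boolean per input p-value: True where the test is declared
    significant at false discovery rate q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= q * k / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank
    keep = [False] * m
    for i in order[:cutoff_rank]:
        keep[i] = True
    return keep


# Invented per-voxel p-values; voxels marked False would be drawn in grey.
voxel_mask = bh_significant([0.001, 0.04, 0.03, 0.5], q=0.05)
```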

  7. PrAGMATiC atlas likelihood maps.
    Extended Data Fig. 4: PrAGMATiC atlas likelihood maps.

    Comparison of actual semantic maps (Fig. 2, Extended Data Fig. 3) to the maps generated from the PrAGMATiC atlas (Fig. 3). PrAGMATiC atlases for the left and right hemispheres were fit using data from all seven subjects. The left hemisphere atlas has 192 total areas and the right hemisphere has 128 (including non-semantic areas). Here we show the actual semantic maps for four subjects (first column), the PrAGMATiC atlas on each subject’s cortical surface (second column), the log likelihood ratio of the actual semantic map under the PrAGMATiC atlas versus a null model (third column), and the fraction of variance in the semantic map that the PrAGMATiC atlas explains for each location on the cortical surface (fourth column). The likelihood ratio maps show that most areas where there are large semantic model weights (that is, the semantic system) are much better explained by PrAGMATiC than by a null model and thus appear red, while areas where the weights are small (that is, somatomotor cortex, visual cortex, and so on) are about equally well explained by both PrAGMATiC and the null model and thus appear white. Variance explained was computed by subtracting the PrAGMATiC atlas from the actual semantic map (in the space of the four group semantic dimensions), squaring and summing the residuals and then dividing by the sum of squares in the actual map. The variance explained maps show that the PrAGMATiC atlas captures a large fraction of the variance in the semantic maps (37–47% in total).
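
    The variance-explained computation described above can be written out as a short sketch. The caption gives the ratio of residual to total sum of squares; taking one minus that ratio as the fraction explained is an assumption made here, and the example values are invented.

```python
def variance_explained(actual, predicted):
    """Fraction of the actual map's sum of squares captured by the atlas.

    `actual` and `predicted` are lists of per-location vectors in the
    space of the group semantic dimensions.
    """
    ss_res = sum(
        (a - p) ** 2
        for row_a, row_p in zip(actual, predicted)
        for a, p in zip(row_a, row_p)
    )
    ss_tot = sum(a ** 2 for row_a in actual for a in row_a)
    return 1.0 - ss_res / ss_tot


# Two cortical locations, four semantic dimensions each (toy values).
actual_map = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
atlas_map = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
fraction = variance_explained(actual_map, atlas_map)
```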

  8. Comparison of PrAGMATiC models fit with different initial conditions.
    Extended Data Fig. 5: Comparison of PrAGMATiC models fit with different initial conditions.

    As with many clustering algorithms, PrAGMATiC optimizes a non-convex objective function and so can find many potential locally optimal solutions. To reduce the effect of non-convexity on our results, we re-fit the model ten times (each time with a different random initialization), and then selected the model fit that yielded the best likelihood (that is, performance on the training set) as the PrAGMATiC atlas (Fig. 3). Here we show the PrAGMATiC atlas (top) and the second best model out of the ten that were estimated (bottom). The parcellations given by these two models are very similar. However, there are a few differences, which illustrate uncertainty in the model. Some of these differences are due to statistical thresholding: a few areas that were found to be significantly semantically selective in the best model are missing in the alternative model (see left medial prefrontal cortex), and some significant areas in the alternative model are missing from the best model (left ventral occipital cortex). Other differences suggest alternative parcellations for a few regions, where, for example, the same region of cortex is parcellated into three areas in the best model and four areas in the alternative model. Yet it is clear that none of the differences between these two models are sufficient to change any of the interpretations given in the main text.
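
    The restart strategy described here follows a common pattern, sketched below on a toy problem. The one-dimensional objective, the gradient-ascent fit and all names are stand-ins invented for illustration; only the idea of ten random initializations ranked by training likelihood comes from the caption.

```python
import random


def toy_objective(x):
    """Stand-in non-convex objective with two local maxima."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def toy_fit(rng, steps=500, rate=0.01):
    """Gradient ascent from a random start; returns (objective, parameter)."""
    x = rng.uniform(-2.0, 2.0)
    for _ in range(steps):
        gradient = -4.0 * x * (x * x - 1.0) + 0.3
        x += rate * gradient
    return toy_objective(x), x


def rank_restarts(fit_once, n_restarts=10, seed=0):
    """Run `fit_once` from several random initializations and return the
    fits sorted from best to worst objective value."""
    master = random.Random(seed)
    fits = [
        fit_once(random.Random(master.randrange(2 ** 32)))
        for _ in range(n_restarts)
    ]
    return sorted(fits, key=lambda fit: fit[0], reverse=True)


ranked = rank_restarts(toy_fit)
best_fit, second_best_fit = ranked[0], ranked[1]
```

    Comparing `best_fit` with `second_best_fit` mirrors the comparison shown in this figure: if the two solutions are close, the result is probably not very sensitive to initialization.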

  9. Semantic atlas for the LPC.
    Extended Data Fig. 6: Semantic atlas for the LPC.

    The PrAGMATiC atlas divides the LPC into 15 areas in the left hemisphere and 13 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely this 12-category interpretation captures the average semantic model in each area. The LPC appears to be organized around the angular gyrus (AG), with a core that is selective for social, emotional and mental concepts (L6, 7, 9, 11; R5, 7) and a periphery that is selective for visual, tactile and numeric concepts (L2, 4, 5, 8, 10, 15; R6, 11).

  10. Semantic atlas for the MPC.
    Extended Data Fig. 7: Semantic atlas for the MPC.

    The PrAGMATiC atlas divides the MPC into 14 areas in the left hemisphere and 10 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the MPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Like the LPC, the MPC appears to be organized around a core group of areas that are selective for social and mental concepts (L6, 8, 10; R6, 7). Dorsolateral MPC areas (L2, 4; R1) are selective for visual and tactile concepts. Anterior dorsal areas (L5, 9; R4, 9) are selective for temporal concepts. Ventral areas (L11, 12, 14; R8) are selective for professional, temporal and locational concepts. Just above the retrosplenial cortex one distinct area in each hemisphere is selective for mental, professional and temporal concepts (L7; R3). Overall, the right MPC responds more than the left MPC to mental concepts.

  11. Semantic atlas for the SPFC.
    Extended Data Fig. 8: Semantic atlas for the SPFC.

    The PrAGMATiC atlas divides the SPFC into 18 areas in the left hemisphere and 19 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the SPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The organization in the SPFC seems to follow the long rostro-caudal sulci and gyri of the dorsal frontal lobe. Posterior–lateral SPFC areas (L4, 6; R6, 9, 11) are selective for social, emotional, communal and violent concepts. Posterior superior frontal sulcus areas (L2, 3, 7, 8; R1, 5, 7) are selective for visual, tactile and numeric concepts. The superior frontal gyrus contains a long strip of areas (L1, 5, 10, 12–15; R8, 12, 14–16) selective for social, emotional, communal and violent concepts.

  12. Semantic atlas for the LTC.
    Extended Data Fig. 9: Semantic atlas for the LTC.

    The PrAGMATiC atlas divides the LTC into eight areas in both the left and right hemispheres. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Anterior LTC areas (L4–8; R3–8) are selective for social, emotional, mental and violent concepts. Posterior LTC areas (L1–3; R1–2) are selective for numeric, tactile and visual concepts.

  13. Semantic atlas for the VTC.
    Extended Data Fig. 10: Semantic atlas for the VTC.

    The PrAGMATiC atlas divides the VTC into six areas in the left hemisphere and one area in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the VTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The VTC is relatively homogeneous: all areas are selective for numeric, tactile and visual concepts. Left VTC areas close to the parahippocampal place area (PPA) are also selective for locational concepts (L5–6).

  14. Semantic atlas for the IPFC.
    Extended Data Fig. 11: Semantic atlas for the IPFC.

    The PrAGMATiC atlas divides the IPFC into 12 areas in the left hemisphere and 9 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the IPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Posterior IPFC areas in the precentral sulcus (L1–3; R1, 2) are selective for visual, tactile and numeric concepts. Areas on the inferior frontal gyrus (L8; R4, 7) are selective for social and violent concepts. Areas in the inferior frontal sulcus and anterior middle frontal gyrus (L4–7; R5–6) are selective for visual, tactile and numeric concepts. Areas in the orbitofrontal sulci (L10; R9) are also selective for visual, tactile, numeric and locational concepts.

  15. Semantic atlas for the opercular and insular cortex.
    Extended Data Fig. 12: Semantic atlas for the opercular and insular cortex.

    The PrAGMATiC atlas divides the opercular and insular cortex (OIC) into four areas in the left hemisphere and three areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the OIC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. These areas are homogeneously selective for abstract concepts, with more posterior and superior areas also responding to emotional, communal and mental concepts.

Author information

Affiliations

  1. Helen Wills Neuroscience Institute, University of California, Berkeley, California 94720, USA

    • Alexander G. Huth,
    • Thomas L. Griffiths,
    • Frédéric E. Theunissen &
    • Jack L. Gallant
  2. Department of Psychology, University of California, Berkeley, California 94720, USA

    • Wendy A. de Heer,
    • Thomas L. Griffiths,
    • Frédéric E. Theunissen &
    • Jack L. Gallant

Contributions

All authors helped conceive and design the experiment. W.A.d.H. and A.G.H. selected and annotated stimuli and collected fMRI data. A.G.H. analysed the data. A.G.H. and T.L.G. designed the PrAGMATiC generative model. A.G.H. and J.L.G. wrote the paper. J.L.G. contributed to all aspects of the project.

Competing financial interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Voxel-wise model prediction performance. (1,838 KB)

    Cortical flatmaps showing prediction performance of voxel-wise semantic models for all seven subjects, formatted similarly to Fig. 1c. Models were tested using one 10-min story that was not included during model estimation. Prediction performance was then computed as the correlation between predicted and measured BOLD responses. Left column, raw prediction performance. Note that the colourmap here is scaled 0–1 rather than 0–0.6 as in Fig. 1c to match the scale of the adjusted prediction performance maps. Right column, prediction performance corrected to account for different amounts of noise in the BOLD responses (see Supplementary Methods for details). The voxel-wise semantic models predict BOLD responses in many brain areas, including SPFC, IPFC, LTC, VTC, LPC and MPC. These same regions have been previously identified as the semantic system in the human brain.

  2. Extended Data Figure 2: Amount of variance explained by individual subject and group semantic dimensions. (166 KB)

    Principal components analysis was used to discover the most important semantic dimensions from voxel-wise semantic model weights in each subject. To reduce noise, we used only the 10,000 best voxels in each subject, determined by cross-validation within the model estimation data set. Here we show the amount of variance explained in the semantic model weights by each of the 20 most important principal components (PCs). Orange lines show the amount of variance explained by each subject’s own PCs, blue lines show the variance explained by the PCs of combined data from the other six subjects, and grey lines show the variance explained by the PCs of the stories. (The Gale–Shapley stable marriage algorithm was used to re-order the group and stimulus PCs to maximize their correlation with the subject’s PCs.) Error bars indicate 99% confidence intervals. Confidence intervals for the subjects’ own PCs and group PCs are very small. Hollow markers indicate subject or group PCs that explain significantly more variance than the corresponding stimulus PCs (P < 0.001, bootstrap test). Six PCs explain significantly more variance in one out of seven subjects, five PCs in two subjects, four PCs in three subjects, and three PCs in one subject. Thus, four PCs seem to comprise a semantic space that is common across most individuals.

  3. Extended Data Figure 3: Separate cortical projections of semantic dimensions 1–4 on subject S2 and combined cortical projections of dimensions 1–3 for subjects S1, S3 and S4. (943 KB)

    a, Voxel-wise semantic model weights for subject S2 were projected onto each of the common semantic dimensions defined by PCs 1–4. Voxels for which model generalization performance was not significantly greater than zero (q(FDR) > 0.05) are shown in grey. Positive projections are shown in red, negative projections in blue and near-zero projections in white. Voxels with fMRI signal dropout due to field inhomogeneity are shaded with black hatched lines. b. Like Fig. 2b, c, this panel shows the result of projecting voxel-wise models onto the first three common semantic dimensions, and then colouring each voxel using an RGB colourmap. The red colour component corresponds to the projection on the first PC, the green component to the second, and the blue component to the third. Semantic information seems to be represented in complex patterns distributed across the semantic system and the patterns seem to be largely conserved across individuals.

  4. Extended Data Figure 4: PrAGMATiC atlas likelihood maps. (668 KB)

    Comparison of actual semantic maps (Fig. 2, Extended Data Fig. 3) to the maps generated from the PrAGMATiC atlas (Fig. 3). PrAGMATiC atlases for the left and right hemispheres were fit using data from all seven subjects. The left hemisphere atlas has 192 total areas and the right hemisphere has 128 (including non-semantic areas). Here we show the actual semantic maps for four subjects (first column), the PrAGMATiC atlas on each subject’s cortical surface (second column), the log likelihood ratio of the actual semantic map under the PrAGMATiC atlas versus a null model (third column), and the fraction of variance in the semantic map that the PrAGMATiC atlas explains for each location on the cortical surface (fourth column). The likelihood ratio maps show that most areas where there are large semantic model weights (that is, the semantic system) are much better explained by PrAGMATiC than by a null model and thus appear red, while areas where the weights are small (that is, somatomotor cortex, visual cortex, and so on) are about equally well explained by both PrAGMATiC and the null model and thus appear white. Variance explained was computed by subtracting the PrAGMATiC atlas from the actual semantic map (in the space of the four group semantic dimensions), squaring and summing the residuals and then dividing by the sum of squares in the actual map. The variance explained maps show that the PrAGMATiC atlas captures a large fraction of the variance in the semantic maps (37–47% in total).

  5. Extended Data Figure 5: Comparison of PrAGMATiC models fit with different initial conditions. (582 KB)

    As with many clustering algorithms, PrAGMATiC optimizes a non-convex objective function and so can find many potential locally optimal solutions. To reduce the effect of non-convexity on our results, we re-fit the model ten times (each time with a different random initialization), and then selected the model fit that yielded the best likelihood (that is, performance on the training set) as the PrAGMATiC atlas (Fig. 3). Here we show the PrAGMATiC atlas (top) and the second-best model out of the ten that were estimated (bottom). The parcellations given by these two models are very similar. However, there are a few differences, which illustrate uncertainty in the model. Some of these differences are due to statistical thresholding: a few areas that were found to be significantly semantically selective in the best model are missing in the alternative model (see left medial prefrontal cortex), and some significant areas in the alternative model are missing from the best model (left ventral occipital cortex). Other differences suggest alternative parcellations for a few regions, where, for example, the same region of cortex is parcellated into three areas in the best model and four areas in the alternative model. Yet it is clear that none of the differences between these two models are sufficient to change any of the interpretations given in the main text.
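The restart strategy described above (fit several times from different random initializations, then keep the fit with the best training-set score) can be sketched generically. This is not the PrAGMATiC model: plain k-means stands in as a simple non-convex clustering objective, with the negative within-cluster sum of squares playing the role of the likelihood. The names `kmeans_fit` and `fit_with_restarts` are hypothetical.

```python
import numpy as np

def kmeans_fit(data, n_clusters, rng, n_iter=50):
    """One k-means fit from a random initialization; returns (score, centres, labels)."""
    # Fancy indexing copies the selected rows, so updating centres leaves data intact.
    centres = data[rng.choice(len(data), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = data[labels == k]
            if len(members) > 0:                  # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    dists = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    score = -dists.min(axis=1).sum()              # higher is better, like a likelihood
    return score, centres, labels

def fit_with_restarts(data, n_clusters, n_restarts=10, seed=0):
    """Fit n_restarts times and return all fits sorted from best to worst score."""
    rng = np.random.default_rng(seed)
    fits = [kmeans_fit(data, n_clusters, rng) for _ in range(n_restarts)]
    fits.sort(key=lambda fit: fit[0], reverse=True)
    return fits

# Synthetic demonstration data: three well-separated blobs in two dimensions.
demo_rng = np.random.default_rng(1)
blob_centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
demo_data = np.concatenate([c + demo_rng.standard_normal((100, 2)) for c in blob_centres])
demo_fits = fit_with_restarts(demo_data, n_clusters=3)
best_fit, second_best_fit = demo_fits[0], demo_fits[1]
```

Keeping the full sorted list, rather than only the winner, makes it straightforward to compare the best and second-best solutions as a rough check on how sensitive the result is to initialization.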

  6. Extended Data Figure 6: Semantic atlas for the LPC. (573 KB)

    The PrAGMATiC atlas divides the LPC into 15 areas in the left hemisphere and 13 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely this 12-category interpretation captures the average semantic model in each area. The LPC appears to be organized around the angular gyrus (AG), with a core that is selective for social, emotional and mental concepts (L6, 7, 9, 11; R5, 7) and a periphery that is selective for visual, tactile and numeric concepts (L2, 4, 5, 8, 10, 15; R6, 11).

  7. Extended Data Figure 7: Semantic atlas for the MPC. (447 KB)

    The PrAGMATiC atlas divides the MPC into 14 areas in the left hemisphere and 10 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the MPC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average predicted response of each area to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Like the LPC, the MPC appears to be organized around a core group of areas that are selective for social and mental concepts (L6, 8, 10; R6, 7). Dorsolateral MPC areas (L2, 4; R1) are selective for visual and tactile concepts. Anterior dorsal areas (L5, 9; R4, 9) are selective for temporal concepts. Ventral areas (L11, 12, 14; R8) are selective for professional, temporal and locational concepts. Just above the retrosplenial cortex one distinct area in each hemisphere is selective for mental, professional and temporal concepts (L7; R3). Overall, the right MPC responds more than the left MPC to mental concepts.

  8. Extended Data Figure 8: Semantic atlas for the SPFC. (505 KB)

    The PrAGMATiC atlas divides the SPFC into 18 areas in the left hemisphere and 19 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the SPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The organization in the SPFC seems to follow the long rostro-caudal sulci and gyri of the dorsal frontal lobe. Posterior–lateral SPFC areas (L4, 6; R6, 9, 11) are selective for social, emotional, communal and violent concepts. Posterior superior frontal sulcus areas (L2, 3, 7, 8; R1, 5, 7) are selective for visual, tactile and numeric concepts. The superior frontal gyrus contains a long strip of areas (L1, 5, 10, 12–15; R8, 12, 14–16) selective for social, emotional, communal and violent concepts.

  9. Extended Data Figure 9: Semantic atlas for the LTC. (497 KB)

    The PrAGMATiC atlas divides the LTC into eight areas in each of the left and right hemispheres. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the LTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Anterior LTC areas (L4–8; R3–8) are selective for social, emotional, mental and violent concepts. Posterior LTC areas (L1–3; R1–2) are selective for numeric, tactile and visual concepts.

  10. Extended Data Figure 10: Semantic atlas for the VTC. (415 KB)

    The PrAGMATiC atlas divides the VTC into six areas in the left hemisphere and one area in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the VTC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. The VTC is relatively homogeneous: all areas are selective for numeric, tactile and visual concepts. Left VTC areas close to the parahippocampal place area (PPA) are also selective for locational concepts (L5–6).

  11. Extended Data Figure 11: Semantic atlas for the IPFC. (527 KB)

    The PrAGMATiC atlas divides the IPFC into 12 areas in the left hemisphere and 9 areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the IPFC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. Posterior IPFC areas in the precentral sulcus (L1–3; R1, 2) are selective for visual, tactile and numeric concepts. Areas on the inferior frontal gyrus (L8; R4, 7) are selective for social and violent concepts. Areas in the inferior frontal sulcus and anterior middle frontal gyrus (L4–7; R5–6) are selective for visual, tactile and numeric concepts. Areas in the orbitofrontal sulci (L10; R9) are also selective for visual, tactile, numeric and locational concepts.

  12. Extended Data Figure 12: Semantic atlas for the opercular and insular cortex. (480 KB)

    The PrAGMATiC atlas divides the opercular and insular cortex (OIC) into four areas in the left hemisphere and three areas in the right. Here we show the atlas for each hemisphere (top left and right), three-dimensional brains indicating the location of the OIC (top middle), individual maps for two subjects in each hemisphere (bottom middle), and the average response of each area in the atlas to the 12 semantic categories identified earlier (responses consistently greater than zero across subjects are marked with a plus) (bottom left and right). Bars show how completely the 12-category interpretation captures the average semantic model in each area. These areas are homogeneously selective for abstract concepts, with more posterior and superior areas also responding to emotional, communal and mental concepts.

Supplementary information

PDF files

  1. Supplementary Information (1.2 MB)

    This file contains Supplementary Data, Supplementary Methods, Supplementary Tables 1–3 and Supplementary References.

Additional data