Abstract
Long-standing theories in affective science conceive of perceived emotional stimuli either as discrete categories (for example, an angry voice) or as continuous dimensional attributes (for example, an intense and negative vocal emotion). Which position provides the better account is still widely debated. Here we contrast the two positions in their ability to account for the acoustics-independent perceptual and cerebral representational geometry of perceived voice emotions. We combined multimodal imaging of the cerebral response to heard vocal stimuli (using functional magnetic resonance imaging and magnetoencephalography) with post-scanning behavioural assessment of voice emotion perception. Using representational similarity analysis, we find that categories prevail in perceptual and early (less than 200 ms) frontotemporal cerebral representational geometries, whereas dimensions impinge predominantly on a later limbic–temporal network (at 240 ms and after 500 ms). These results reconcile the two opposing views by reframing the perception of emotions as the interplay of cerebral networks with different representational dynamics that emphasize either categories or dimensions.
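For readers unfamiliar with representational similarity analysis (RSA), the MATLAB sketch below illustrates its core logic on toy data: the category and dimension models are each expressed as a representational dissimilarity matrix (RDM) and compared against a neural RDM. Everything here (nStim, cats, dims, resp and all values) is a hypothetical placeholder; this is a minimal illustration of the technique, not the authors' released pipeline (see Code availability).

```matlab
% Minimal RSA sketch with toy data (hypothetical; requires the Statistics Toolbox).
nStim = 30;                              % hypothetical number of voice stimuli
cats  = repmat(1:5, 1, nStim/5);         % hypothetical emotion-category labels
dims  = rand(nStim, 2);                  % hypothetical arousal/valence ratings
resp  = rand(nStim, 100);                % hypothetical response patterns (stimuli x channels)

% Model RDMs: the categorical model predicts 0 dissimilarity within and 1 between
% categories; the dimensional model predicts Euclidean distance in rating space.
rdmCat = double(bsxfun(@ne, cats', cats));
rdmDim = squareform(pdist(dims));
rdmNeu = squareform(pdist(resp, 'correlation'));   % neural RDM: 1 - correlation

% Compare each model RDM with the neural RDM (Spearman, lower triangle only).
lt = tril(true(nStim), -1);
fprintf('categories: rho = %.3f\n', corr(rdmCat(lt), rdmNeu(lt), 'Type', 'Spearman'));
fprintf('dimensions: rho = %.3f\n', corr(rdmDim(lt), rdmNeu(lt), 'Type', 'Spearman'));
```

The model whose RDM correlates more strongly with the neural RDM provides the better account of that representational geometry; the study's central result is that this comparison favours categories early and dimensions later.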
Data availability
The following materials are available from a Dryad repository (https://datadryad.org/stash/dataset/doi:10.5061/dryad.m905qfv0k): single-trial behavioural data, single cross-validation-fold fMRI data and single-trial MEG data for all participants; anonymized anatomical information required to reconstruct the MEG sources and deform native-space statistical maps to DARTEL and MNI space; and sound stimuli and their modulation transfer function (MTF) representations.
Code availability
The MATLAB code for reconstructing the MEG sources, carrying out the group-level RSA of the fMRI and MEG representations of perceived emotions, and generating MNI-space statistical maps is available at the same Dryad repository: https://datadryad.org/stash/dataset/doi:10.5061/dryad.m905qfv0k.
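As a rough illustration of how an RDM correlation of the kind computed above can be tested for significance, the sketch below runs a stimulus-label permutation test (the general logic behind Mantel-style tests for RDM comparisons). All RDMs and parameters are hypothetical placeholders, not the released code.

```matlab
% Stimulus-label permutation test for a model-vs-neural RDM correlation
% (hypothetical illustration; requires the Statistics Toolbox).
nStim = 30; nPerm = 10000;
rdmModel  = squareform(pdist(rand(nStim, 2)));     % hypothetical model RDM
rdmNeural = squareform(pdist(rand(nStim, 100)));   % hypothetical neural RDM
lt  = tril(true(nStim), -1);                       % lower-triangle mask
obs = corr(rdmModel(lt), rdmNeural(lt), 'Type', 'Spearman');

nullDist = zeros(nPerm, 1);
for p = 1:nPerm
    idx      = randperm(nStim);                    % shuffle stimulus labels
    permRdm  = rdmNeural(idx, idx);                % consistently reorder rows and columns
    nullDist(p) = corr(rdmModel(lt), permRdm(lt), 'Type', 'Spearman');
end
pval = (sum(nullDist >= obs) + 1) / (nPerm + 1);   % one-sided permutation p value
```

Permuting rows and columns together preserves the structure of the neural RDM while breaking its alignment with the model, which is what makes the resulting null distribution valid.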
Acknowledgements
This work was supported by the UK Biotechnology and Biological Sciences Research Council (grants BB/M009742/1 to J.G., B.L.G., S.A.K. and P.B., and BB/L023288/1 to P.B. and J.G.), by the French Fondation pour la Recherche Médicale (grant AJE201214 to P.B.), and by grants ANR-16-CONV-0002 (ILCB) and ANR-11-LABX-0036 (BLRI) and the Excellence Initiative of Aix-Marseille University (A*MIDEX). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank O. Coulon and O. Garrod for help with the development of the 3D glass brain, and Y. Cao, I. Charest, C. Crivelli, B. De Gelder, G. Masson, R. A. A. Ince, F. Kusnir, S. McAdams and R. J. Zatorre for useful comments on previous versions of the manuscript.
Author information
Contributions
Conceptualization: B.L.G. and P.B.; methodology: B.L.G., C.W., N.K., S.A.K., P.B. and J.G.; software: B.L.G.; validation: B.L.G.; formal analysis: B.L.G., C.W. and J.G.; investigation: B.L.G. and C.W.; resources: B.L.G. and P.B.; data curation: B.L.G. and C.W.; writing, original draft: B.L.G., C.W., S.A.K., P.B. and J.G.; writing, review and editing: B.L.G., C.W., N.K., S.A.K., P.B. and J.G.; visualization: B.L.G.; supervision: B.L.G., P.B. and J.G.; project administration: J.G.; and funding acquisition: B.L.G., S.A.K., P.B. and J.G.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Human Behaviour thanks Behtash Babadi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Jamie Horder and Marike Schiffer.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–4 and Supplementary Table 1.
Supplementary Audio 1
Sound stimuli
Rights and permissions
About this article
Cite this article
Giordano, B.L., Whiting, C., Kriegeskorte, N. et al. The representational dynamics of perceived voice emotions evolve from categories to dimensions. Nat Hum Behav 5, 1203–1213 (2021). https://doi.org/10.1038/s41562-021-01073-0