Abstract
Is the singing voice processed distinctively in the human brain? In this Perspective, we discuss what might distinguish song processing from speech processing in light of recent work suggesting that some cortical neuronal populations respond selectively to song, and we outline the implications for our understanding of auditory processing. We review the literature on the neural and physiological mechanisms of song production and perception and show that it provides evidence for key differences between song and speech processing. We conclude by discussing the significance of the notion that song processing is special: how it might contribute to theories of the neurobiological origins of vocal communication and to our understanding of the neural circuitry underlying sound processing in the human cortex.
Acknowledgements
The authors thank the past and present members of the ICN Speech and Communication Laboratory for wonderfully stimulating lab meetings. The first author further thanks UCL for its generous GRS/ORS funding opportunity and I. Cross for his past mentorship and very interesting e-mail discussions.
Author information
Contributions
S.K.S. and I.H. researched data for the article, wrote the article, and reviewed and/or edited the manuscript before submission. All authors contributed substantially to discussion of the content.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Reviews Neuroscience thanks Christina Vanden Bosch der Nederlanden and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
About this article
Cite this article
Harris, I., Niven, E. C., Griffin, A. et al. Is song processing distinct and special in the auditory cortex? Nat. Rev. Neurosci. 24, 711–722 (2023). https://doi.org/10.1038/s41583-023-00743-4
DOI: https://doi.org/10.1038/s41583-023-00743-4