The recognition of spoken language has typically been studied by focusing on either words or their constituent elements (for example, low-level features or phonemes). More recently, the ‘temporal mesoscale’ of speech has been explored, specifically regularities in the envelope of the acoustic signal that correlate with syllabic information and that play a central role in production and perception processes. The temporal structure of speech at this scale is remarkably stable across languages, with a preferred range of rhythmicity of 2–8 Hz. Importantly, this rhythmicity is required by the processes underlying the construction of intelligible speech. Much current work focuses on audio-motor interactions in speech, highlighting behavioural and neural evidence that demonstrates how properties of the perceptual and motor systems, and the relation between them, can underlie the mesoscale speech rhythms. The data invite the hypothesis that the speech motor cortex is best modelled as a neural oscillator, a conjecture that aligns well with current proposals highlighting the fundamental role of neural oscillations in perception and cognition. The findings also cast motor theories of speech in a different light, placing new mechanistic constraints on accounts of the action–perception interface.
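As a toy illustration of what a regularity in the envelope of the acoustic signal means, the sketch below (not taken from the article) builds a synthetic signal whose amplitude is modulated at an assumed syllable-like rate of 4 Hz, extracts a crude envelope by rectification, and locates the strongest modulation frequency. The function name `modulation_spectrum`, the 120 Hz carrier, the sampling rate and the modulation depth are all illustrative assumptions. Analyses of real speech typically use band-pass filtering and more careful envelope extraction.

```python
import math

def modulation_spectrum(signal, fs, freqs):
    """Return the envelope amplitude at each candidate modulation frequency.

    The envelope is approximated by full-wave rectification (absolute value),
    which is a simplifying assumption, not the method of any cited study.
    """
    envelope = [abs(x) for x in signal]
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]  # remove the constant (0 Hz) part
    spectrum = []
    for f in freqs:
        # Project the envelope onto a cosine and a sine at frequency f.
        re = sum(e * math.cos(2 * math.pi * f * n / fs) for n, e in enumerate(centered))
        im = sum(e * math.sin(2 * math.pi * f * n / fs) for n, e in enumerate(centered))
        spectrum.append(math.hypot(re, im) / len(centered))
    return spectrum

# Hypothetical parameters chosen only for illustration.
fs = 1000             # sampling rate in Hz
duration = 2.0        # seconds
syllable_rate = 4.0   # assumed modulation rate in Hz, inside the 2-8 Hz range
carrier = 120.0       # carrier tone in Hz, standing in for the speech fine structure

n_samples = int(fs * duration)
signal = [
    (1 + 0.8 * math.sin(2 * math.pi * syllable_rate * n / fs))
    * math.sin(2 * math.pi * carrier * n / fs)
    for n in range(n_samples)
]

freqs = [0.5 * k for k in range(1, 33)]  # candidate rates from 0.5 to 16 Hz
spectrum = modulation_spectrum(signal, fs, freqs)
peak_rate = freqs[spectrum.index(max(spectrum))]
print(peak_rate)  # 4.0 for this synthetic signal
```

The recovered peak simply reflects the modulation rate that was built into the synthetic signal. The point is only to show how a slow rhythm carried by the envelope can be read out separately from the faster fine structure.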
The authors thank O. Ghitza and J. Orpella for valuable feedback. They acknowledge the support of the Max Planck Society and NIH R01DC05660.
The authors declare no competing interests.
Peer reviewer information
Nature Reviews Neuroscience thanks J. Gross, G. Mindlin and the other anonymous reviewer for their contribution to the peer review of this work.
- Distinctive features
Stable auditory and/or articulatory patterns that distinguish phonemes, for example ‘voicing’ in /b/ versus /p/.
- Phones
Brief segments of speech that have characteristic physical or perceptual attributes.
- Phonemes
The speech elements of a language (vowels and consonants) that encode words.
- Segmentation
The process of chunking the continuous acoustic stream of spoken language into units.
- Decoding
Mapping the segmented acoustic chunks into linguistic units (phonemes, syllables or words) stored in the mental dictionary.
- Audio-motor integration
The alignment or merging of information computed in the auditory and (speech) motor systems.
- Spectrogram
A visualization of how the frequency composition of a signal evolves over time.
- Critical band filtering
Decomposing a signal into different frequency bands defined according to the frequency response of the relevant biophysical system.
- 1/f noise spectrum
The power spectrum of noise decreases with frequency, an attribute of many biological signals.
A representation of how much energy a signal carries in each frequency band.
- Vocal tract
The set of anatomical cavities above the larynx that shape the production of speech.
- Velum
Part of the roof of the oral cavity comprising connective tissue and muscle, also called the soft palate.
- Syllable
A basic unit of spoken language, typically comprising a vowel (energy peak) with adjoining consonants (for example, /bar/), and thus a short sequence of speech sounds.
- Entrainment
The synchronization of brain activity to the temporal structure of a stimulus or between the activity of neural elements.
About this article
Cite this article
Poeppel, D., Assaneo, M.F. Speech rhythms and their neural foundations. Nat Rev Neurosci 21, 322–334 (2020). https://doi.org/10.1038/s41583-020-0304-4