  • Review Article

Speech rhythms and their neural foundations

Abstract

The recognition of spoken language has typically been studied by focusing on either words or their constituent elements (for example, low-level features or phonemes). More recently, the ‘temporal mesoscale’ of speech has been explored, specifically regularities in the envelope of the acoustic signal that correlate with syllabic information and that play a central role in production and perception processes. The temporal structure of speech at this scale is remarkably stable across languages, with a preferred range of rhythmicity of 2–8 Hz. Importantly, this rhythmicity is required by the processes underlying the construction of intelligible speech. Much current work focuses on audio-motor interactions in speech, highlighting behavioural and neural evidence that demonstrates how properties of perceptual and motor systems, and their relation, can underlie the mesoscale speech rhythms. The data invite the hypothesis that the speech motor cortex is best modelled as a neural oscillator, a conjecture that aligns well with current proposals highlighting the fundamental role of neural oscillations in perception and cognition. The findings also show motor theories (of speech) in a different light, placing new mechanistic constraints on accounts of the action–perception interface.
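
As a rough illustration of what "regularities in the envelope of the acoustic signal" means in practice, the following minimal sketch extracts an amplitude envelope and locates the dominant modulation rate. It is not the analysis used in the work reviewed here. The toy signal (a 500 Hz tone modulated at 5 Hz), the function names `amplitude_envelope` and `peak_modulation_rate`, and the 0.5–32 Hz search band are all assumptions made for the example; real speech would need band-wise filtering and averaging over many utterances.

```python
import numpy as np


def amplitude_envelope(signal):
    """Envelope as the magnitude of the analytic signal (FFT-based Hilbert transform)."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def peak_modulation_rate(signal, sample_rate, f_min=0.5, f_max=32.0):
    """Frequency (Hz) of the largest peak in the envelope power spectrum within [f_min, f_max]."""
    envelope = amplitude_envelope(signal)
    envelope = envelope - envelope.mean()  # remove the DC offset before the FFT
    power = np.abs(np.fft.rfft(envelope)) ** 2
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    band = (freqs >= f_min) & (freqs <= f_max)
    return freqs[band][np.argmax(power[band])]


# Hypothetical stand-in for speech: a 500 Hz carrier whose amplitude
# fluctuates at a syllable-like rate of 5 Hz (assumed values).
sample_rate = 16000
t = np.arange(0, 4.0, 1.0 / sample_rate)
carrier = np.sin(2 * np.pi * 500.0 * t)
toy_speech = (1.0 + 0.8 * np.sin(2 * np.pi * 5.0 * t)) * carrier

rate = peak_modulation_rate(toy_speech, sample_rate)
print(f"Dominant envelope modulation rate: {rate:.2f} Hz")
```

For this toy input the recovered rate falls inside the 2–8 Hz range described above, because the modulation was placed there by construction. The sketch only shows how such a rate can be read off an envelope spectrum.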

Fig. 1: Different timescale representations of a generic acoustic speech signal.
Fig. 2: Speech production exhibits rhythmicity.
Fig. 3: Cortical structures supporting spoken language and sensorimotor interaction.
Fig. 4: Speech production system can be modelled as an oscillator.

Acknowledgements

The authors thank O. Ghitza and J. Orpella for valuable feedback. They acknowledge the support of the Max Planck Society and NIH R01DC05660.

Author information

Contributions

Both authors contributed equally to all aspects of the manuscript.

Corresponding author

Correspondence to David Poeppel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer reviewer information

Nature Reviews Neuroscience thanks J. Gross, G. Mindlin and the other anonymous reviewer for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Distinctive features

Stable auditory and/or articulatory patterns that distinguish phonemes, for example ‘voicing’ in /b/ versus /p/.

Phones

Brief segments of speech that have characteristic physical or perceptual attributes.

Phonemes

The speech elements of a language (vowels and consonants) that encode words.

Segmentation

The process of chunking the continuous acoustic stream of spoken language into units.

Decoding

Mapping the segmented acoustic chunks onto linguistic units (phonemes, syllables or words) stored in the mental dictionary.

Audio-motor integration

The alignment or merging of information computed in the auditory and (speech) motor systems.

Spectrogram

A visualization of how the frequency composition of a signal evolves over time.

Critical band filtering

Decomposing a signal into different frequency bands defined according to the frequency response of the relevant biophysical system.

1/f noise spectrum

A noise power spectrum in which power falls off in inverse proportion to frequency, an attribute of many biological signals.

Spectrum

A representation of how much energy a signal carries in each frequency band.

Vocal tract

The set of anatomical cavities above the larynx that shape the production of speech.

Velum

Part of the roof of the oral cavity comprising connective tissue and muscle, also called the soft palate.

Syllable

A basic unit of spoken language, typically comprising a vowel (energy peak) with adjoining consonants (for example, /bar/), and thus a short sequence of speech sounds.

Entrainment

The synchronization of brain activity to the temporal structure of a stimulus or between the activity of neural elements.

About this article

Cite this article

Poeppel, D., Assaneo, M.F. Speech rhythms and their neural foundations. Nat Rev Neurosci 21, 322–334 (2020). https://doi.org/10.1038/s41583-020-0304-4

  • DOI: https://doi.org/10.1038/s41583-020-0304-4
