Speech rhythms and their neural foundations



The recognition of spoken language has typically been studied by focusing on either words or their constituent elements (for example, low-level features or phonemes). More recently, the ‘temporal mesoscale’ of speech has been explored, specifically regularities in the envelope of the acoustic signal that correlate with syllabic information and that play a central role in production and perception processes. The temporal structure of speech at this scale is remarkably stable across languages, with a preferred range of rhythmicity of 2–8 Hz. Importantly, this rhythmicity is required by the processes underlying the construction of intelligible speech. Much current work focuses on audio-motor interactions in speech, highlighting behavioural and neural evidence that demonstrates how properties of perceptual and motor systems, and their relation, can underlie the mesoscale speech rhythms. The data invite the hypothesis that the speech motor cortex is best modelled as a neural oscillator, a conjecture that aligns well with current proposals highlighting the fundamental role of neural oscillations in perception and cognition. The findings also show motor theories (of speech) in a different light, placing new mechanistic constraints on accounts of the action–perception interface.
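To make the notion of envelope rhythmicity concrete, the following minimal sketch extracts the amplitude envelope of a signal and locates the dominant rate of its slow fluctuations. It is an illustration under stated assumptions, not the analysis pipeline of any study discussed here: the function names `amplitude_envelope` and `peak_modulation_rate`, the 1–20 Hz search band and the synthetic amplitude-modulated tone standing in for speech are all hypothetical choices made for this example.

```python
import numpy as np


def amplitude_envelope(signal):
    """Return the magnitude of the analytic signal (FFT-based Hilbert transform)."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero negatives.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def peak_modulation_rate(envelope, fs, lo=1.0, hi=20.0):
    """Return the frequency (Hz) with the most envelope power between lo and hi."""
    centred = envelope - envelope.mean()  # remove the DC offset before the FFT
    freqs = np.fft.rfftfreq(len(centred), d=1.0 / fs)
    power = np.abs(np.fft.rfft(centred)) ** 2
    band = (freqs >= lo) & (freqs <= hi)  # assumed search band, not from the article
    return freqs[band][np.argmax(power[band])]


# Toy stand-in for speech (an assumption): a 200 Hz carrier whose amplitude
# fluctuates at 5 Hz, a rate inside the 2-8 Hz range described above.
fs = 1000  # sampling rate in Hz
t = np.arange(0, 4.0, 1.0 / fs)
syllable_rate = 5.0
toy = (1.0 + 0.8 * np.sin(2 * np.pi * syllable_rate * t)) * np.sin(2 * np.pi * 200.0 * t)

env = amplitude_envelope(toy)
rate = peak_modulation_rate(env, fs)
print(f"dominant envelope modulation rate: {rate:.2f} Hz")
```

For this toy signal the peak sits at the 5 Hz modulation rate that was built in. Real speech would typically require additional steps, such as filtering into cochlear-like frequency bands and averaging over many utterances, before a comparable modulation spectrum could be read off.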


Fig. 1: Different timescale representations of a generic acoustic speech signal.
Fig. 2: Speech production exhibits rhythmicity.
Fig. 3: Cortical structures supporting spoken language and sensorimotor interaction.
Fig. 4: Speech production system can be modelled as an oscillator.



    PubMed  Article  Google Scholar 

Download references


The authors thank O. Ghitza and J. Orpella for valuable feedback. They acknowledge the support of the Max Planck Society and NIH R01DC05660.

Author information




Both authors contributed equally to all aspects of the manuscript.

Corresponding author

Correspondence to David Poeppel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer reviewer information

Nature Reviews Neuroscience thanks J. Gross, G. Mindlin and the other anonymous reviewer for their contribution to the peer review of this work.



Distinctive features

Stable auditory and/or articulatory patterns that distinguish phonemes, for example ‘voicing’ in /b/ versus /p/.


Phones

Brief segments of speech that have characteristic physical or perceptual attributes.


Phonemes

The speech elements of a language (vowels and consonants) that encode words.


Segmentation

The process of chunking the continuous acoustic stream of spoken language into units.


Decoding

Mapping the segmented acoustic chunks onto linguistic units (phonemes, syllables or words) stored in the mental dictionary.

Audio-motor integration

The alignment or merging of information computed in the auditory and (speech) motor systems.


Spectrogram

A visualization of how the frequency composition of a signal evolves over time.

Critical band filtering

Decomposing a signal into different frequency bands defined according to the frequency response of the relevant biophysical system.

1/f noise spectrum

A noise power spectrum in which power decreases as frequency increases (roughly in proportion to 1/f), an attribute of many biological signals.


Power spectrum

A representation of how much energy a signal carries in each frequency band.

Vocal tract

The set of anatomical cavities above the larynx that shape the production of speech.


Velum

Part of the roof of the oral cavity comprising connective tissue and muscle, also called the soft palate.


Syllable

A basic unit of spoken language, typically comprising a vowel (energy peak) with adjoining consonants (for example, /bar/), and thus a short sequence of speech sounds.


Entrainment

The synchronization of brain activity to the temporal structure of a stimulus or between the activity of neural elements.

Cite this article

Poeppel, D., Assaneo, M.F. Speech rhythms and their neural foundations. Nat Rev Neurosci 21, 322–334 (2020). https://doi.org/10.1038/s41583-020-0304-4
