During conversation, people take turns speaking by rapidly responding to their partners while simultaneously avoiding interruption [1,2]. Such interactions display a remarkable degree of coordination, as gaps between turns are typically about 200 milliseconds [3], approximately the duration of an eyeblink [4]. These latencies are considerably shorter than those observed in simple word-production tasks, which indicates that speakers often plan their responses while listening to their partners [2]. Although a distributed network of brain regions has been implicated in speech planning [5–9], the neural dynamics underlying the specific preparatory processes that enable rapid turn-taking are poorly understood. Here we use intracranial electrocorticography to precisely measure neural activity as participants perform interactive tasks, and we observe a functionally and anatomically distinct class of planning-related cortical dynamics. We localize these responses to a frontotemporal circuit centred on the language-critical caudal inferior frontal cortex [10] (Broca’s region) and the caudal middle frontal gyrus, a region not normally implicated in speech planning [11–13]. Using a series of motor tasks, we then show that this planning network is more active when preparing speech as opposed to non-linguistic actions. Finally, we delineate planning-related circuitry during natural conversation that is nearly identical to the network mapped with our interactive tasks, and we find this circuit to be most active before participant speech during unconstrained turn-taking. Therefore, we have identified a speech planning network that is central to natural language generation during social interaction.
The data used in these analyses are not publicly available owing to concerns regarding patient privacy; however, the corresponding author will provide deidentified primary data upon request.
The corresponding author will provide the MATLAB code used in this study for analysis of ECoG and behavioural data upon request.
Sacks, H., Schegloff, E. A. & Jefferson, G. A simplest systematics for the organization of turn-taking for conversation. Language 50, 696–735 (1974).
Levinson, S. C. & Torreira, F. Timing in turn-taking and its implications for processing models of language. Front. Psychol. 6, 731 (2015).
Stivers, T. et al. Universals and cultural variation in turn-taking in conversation. Proc. Natl Acad. Sci. USA 106, 10587–10592 (2009).
Schiffman, H. R. Sensation and Perception: An Integrated Approach (Wiley, 2001).
Flinker, A. et al. Redefining the role of Broca’s area in speech. Proc. Natl Acad. Sci. USA 112, 2871–2875 (2015).
Basilakos, A., Smith, K. G., Fillmore, P., Fridriksson, J. & Fedorenko, E. Functional characterization of the human speech articulation network. Cereb. Cortex 28, 1816–1830 (2018).
Mirman, D., Kraft, A. E., Harvey, D. Y., Brecher, A. R. & Schwartz, M. F. Mapping articulatory and grammatical subcomponents of fluency deficits in post-stroke aphasia. Cogn. Affect. Behav. Neurosci. 19, 1286–1298 (2019).
Guenther, F. H. Neural Control of Speech (MIT, 2016).
Sahin, N. T., Pinker, S., Cash, S. S., Schomer, D. & Halgren, E. Sequential processing of lexical, grammatical, and phonological information within Broca’s area. Science 326, 445–449 (2009).
Broca, P. Remarques sur le siege de la faculté du langage articulé, suivies d’une observation d’aphémie (perte de la parole). Bull. Mem. Soc. Anat. Paris 36, 330–356 (1861).
Chang, E. F. et al. Pure apraxia of speech after resection based in the posterior middle frontal gyrus. Neurosurgery 87, E383–E389 (2020).
Brass, M. & von Cramon, D. Y. The role of the frontal cortex in task preparation. Cereb. Cortex 12, 908–914 (2002).
Sierpowska, J. et al. Involvement of the middle frontal gyrus in language switching as revealed by electrical stimulation mapping and functional magnetic resonance imaging in bilingual brain tumor patients. Cortex 99, 78–92 (2018).
Levinson, S. C. Turn-taking in human communication-origins and implications for language processing. Trends Cogn. Sci. 20, 6–14 (2016).
Indefrey, P. The spatial and temporal signatures of word production components: a critical update. Front. Psychol. 2, 255 (2011).
Schuhmann, T., Schiller, N. O., Goebel, R. & Sack, A. T. The temporal characteristics of functional activation in Broca’s area during overt picture naming. Cortex 45, 1111–1116 (2009).
Ferpozzi, V. et al. Broca’s area as a pre-articulatory phonetic encoder: gating the motor program. Front. Hum. Neurosci. 12, 64 (2018).
Alario, F. X., Chainay, H., Lehericy, S. & Cohen, L. The role of the supplementary motor area (SMA) in word production. Brain Res. 1076, 129–143 (2006).
Ramanarayanan, V., Goldstein, L., Byrd, D. & Narayanan, S. S. An investigation of articulatory setting using real-time magnetic resonance imaging. J. Acoust. Soc. Am. 134, 510–519 (2013).
Bögels, S., Magyari, L. & Levinson, S. C. Neural signatures of response planning occur midway through an incoming question in conversation. Sci. Rep. 5, 12881 (2015).
Ferreira, F. & Swets, B. How incremental is language production? Evidence from the production of utterances requiring the computation of arithmetic sums. J. Mem. Lang. 46, 57–84 (2002).
Wagner, V., Jescheniak, J. D. & Schriefers, H. On the flexibility of grammatical advance planning during sentence production: effects of cognitive load on multiple lexical access. J. Exp. Psychol. Learn. Mem. Cogn. 36, 423–440 (2010).
Dubey, A. & Ray, S. Cortical electrocorticogram (ECoG) is a local signal. J. Neurosci. 39, 4299–4311 (2019).
Cheung, C., Hamilton, L. S., Johnson, K. & Chang, E. F. The auditory representation of speech sounds in human motor cortex. eLife 5, e12577 (2016).
Glanz Iljina, O. et al. Real-life speech production and perception have a shared premotor-cortical substrate. Sci. Rep. 8, 8898 (2018).
Cisek, P. & Kalaska, J. F. Neural mechanisms for interacting with a world full of action choices. Annu. Rev. Neurosci. 33, 269–298 (2010).
Ray, S. & Maunsell, J. H. Different origins of gamma rhythm and high-gamma activity in macaque visual cortex. PLoS Biol. 9, e1000610 (2011).
Flinker, A., Chang, E. F., Barbaro, N. M., Berger, M. S. & Knight, R. T. Sub-centimeter language organization in the human temporal lobe. Brain Lang. 117, 103–109 (2011).
Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013).
Cogan, G. B. et al. Sensory-motor transformations for speech occur bilaterally. Nature 507, 94–98 (2014).
Kotz, S. A. et al. Lexicality drives audio-motor transformations in Broca’s area. Brain Lang. 112, 3–11 (2010).
Fadiga, L. & Craighero, L. Hand actions and speech representation in Broca’s area. Cortex 42, 486–490 (2006).
Knudsen, B., Creemers, A. & Meyer, A. S. Forgotten little words: how backchannels and particles may facilitate speech planning in conversation? Front. Psychol. 11, 593671 (2020).
Long, M. A. et al. Functional segregation of cortical regions underlying speech timing and articulation. Neuron 89, 1187–1193 (2016).
Tate, M. C., Herbet, G., Moritz-Gasser, S., Tate, J. E. & Duffau, H. Probabilistic map of critical functional regions of the human cerebral cortex: Broca’s area revisited. Brain 137, 2773–2782 (2014).
Long, M. A. & Fee, M. S. Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature 456, 189–194 (2008).
Okobi, D. E., Jr, Banerjee, A., Matheson, A. M. M., Phelps, S. M. & Long, M. A. Motor cortical control of vocal interaction in neotropical singing mice. Science 363, 983–988 (2019).
Tremblay, P. & Dick, A. S. Broca and Wernicke are dead, or moving past the classic model of language neurobiology. Brain Lang. 162, 60–71 (2016).
Hosman, T. et al. Auditory cues reveal intended movement information in middle frontal gyrus neuronal ensemble activity of a person with tetraplegia. Sci. Rep. 11, 98 (2021).
Catani, M. et al. Short frontal lobe connections of the human brain. Cortex 48, 273–291 (2012).
Glasser, M. F. et al. A multi-modal parcellation of human cerebral cortex. Nature 536, 171–178 (2016).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Deger, K. & Ziegler, W. Speech motor programming in apraxia of speech. J. Phon. 30, 321–335 (2002).
Jackson, E. S. et al. A fNIRS investigation of speech planning and execution in adults who stutter. Neuroscience 406, 73–85 (2019).
Bögels, S., Casillas, M. & Levinson, S. C. Planning versus comprehension in turn-taking: fast responders show reduced anticipatory processing of the question. Neuropsychologia 109, 295–310 (2018).
Dale, A. M., Fischl, B. & Sereno, M. I. Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9, 179–194 (1999).
Fischl, B. et al. Automatically parcellating the human cerebral cortex. Cereb. Cortex 14, 11–22 (2004).
Klein, A. & Tourville, J. 101 labeled brain images and a consistent human cortical labeling protocol. Front. Neurosci. 6, 171 (2012).
Desikan, R. S. et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31, 968–980 (2006).
Avants, B. B. et al. A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage 54, 2033–2044 (2011).
Tyszka, J. M. & Pauli, W. M. In vivo delineation of subdivisions of the human amygdaloid complex in a high-resolution group template. Hum. Brain Mapp. 37, 3979–3998 (2016).
Kovach, C. K. & Gander, P. E. The demodulated band transform. J. Neurosci. Methods 261, 135–154 (2016).
Liu, Y., Coon, W. G., de Pesters, A., Brunner, P. & Schalk, G. The effects of spatial filtering and artifacts on electrocorticographic signals. J. Neural Eng. 12, 056008 (2015).
Friston, K. J. et al. Statistical parametric maps in functional imaging: a general linear approach. Hum. Brain Mapp. 2, 189–210 (1995).
Qian, T., Wu, W., Zhou, W., Gao, S. & Hong, B. in Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2347–2350.
Tilsen, S. et al. Anticipatory posturing of the vocal tract reveals dissociation of speech movement plans from linguistic units. PLoS ONE 11, e0146813 (2016).
We thank A. Flinker, E. Jackson, J. Krivokapić, D. Schneider, N. Tritsch and members of the Long laboratory for comments on earlier versions of this manuscript; A. Ramirez-Cardenas, H. Chen, K. Ibayashi, H. Kawasaki, K. Nourski, H. Oya, A. Rhone and B. Snoad for help with data collection; and F. Guenther and N. Majaj for helpful conversations. This research was supported by R01 DC019354 (M.A.L.), R01 DC015260 (J.D.W.G.) and Simons Collaboration on the Global Brain (M.A.L.).
The authors declare no competing interests.
Peer review information
Nature thanks Gregory Cogan, Uri Hasson and Frederic Theunissen for their contribution to the peer review of this work. Peer reviewer reports are available.
Extended data figures and tables
a, Description of subprocesses assumed to occur during the perception, planning, and production windows of the CI task. b, Histograms of reaction times (RT) in early and late CI trials for all participants. c, Median RT values for early and late CI trials for all participants. d, e, Histograms depicting the distribution of average peak-to-trough response amplitudes for all electrodes displaying planning-related responses when aligned to CI onset in early and late trials (d) and different CI question types (e); median values for each distribution are indicated. Observed data (in black) are compared with a null distribution (in grey) consisting of randomly chosen timepoints (Methods). f, Schematics displaying GLM regressor structure for an early (top) and a late (bottom) variant of an example CI task question.
a, Full model R values for GLM fits of jittered high gamma activity from participant 436; each line represents data from an individual electrode. b, Example distribution of pooled D values with the fit of two Gaussians overlaid (black). The Gaussian distributions corresponding to well fit (blue) and poorly fit electrodes (red) as well as the 95th percentile of the D distribution for poorly fit electrodes (dashed line) are indicated. D values above the 95th percentile of the pooled distribution were deemed outliers (white bars) and not fitted. c, Table summarizing the number of electrodes rejected by the jittering analysis in each participant. d, Table reporting the anatomical locations of electrodes rejected by the jittering analysis and electrodes displaying significant activity in the CI task. e, Scatterplot depicting the proportion of rejected electrodes within a region as a function of the proportion of responsive electrodes in a region.
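The 95th-percentile cutoff of a fitted Gaussian component, as drawn by the dashed line in panel b, can be computed in closed form once the component's mean and standard deviation are known. The following is a minimal sketch only: the function name and the numerical parameters are hypothetical, and the actual fitting procedure and the way the cutoff is applied to electrodes are described in the Methods, not here.

```python
from statistics import NormalDist


def gaussian_percentile(mean, sd, q=0.95):
    """Return the q-th quantile of a Gaussian with the given mean and sd."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(q)


# Hypothetical parameters for one fitted component (illustration only,
# not values from the study).
cutoff = gaussian_percentile(mean=0.40, sd=0.10)
print(f"95th percentile of the component: {cutoff:.3f}")
```

For a Gaussian, this quantile equals the mean plus roughly 1.645 standard deviations.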
a, Scatterplot depicting the distribution of all simulated task-responsive electrodes from the continuum model in three-dimensional GLM weight space; cluster membership indicated by greyscale colour. b, c, Distribution of simulated electrodes from the continuum model displaying responses in one window (i.e., unmixed) of the CI task (b) or multiple windows (c); response class indicated by colour in b and c and unmixed electrodes denoted by small black points in c. In b, simulated unmixed electrodes located outside the cluster primarily containing electrodes of the same type (i.e., ‘misclustered’) are indicated with an ‘X’. d, e, Histograms depicting the distribution of the proportion of misclustered electrodes responsive during a single task window (i.e., unmixed electrodes) (d), and the proportion of electrodes displaying more than one significant positive weight (i.e., mixed electrodes) (e) across 100,000 iterations of the continuum model simulation. The median of each distribution as well as the values observed in the actual data (dashed line) are indicated. Gold arrows indicate the bin of each distribution containing the measurements corresponding to the example iteration depicted in panels p, r, and t of Fig. 1. f, Table reporting the number of electrodes displaying perception-related responses using either the full model or the reduced GLM lacking a planning regressor. g, h, Scatterplots depicting perception (g) and planning (h) GLM weights in the full model and reduced models lacking a planning regressor or perception regressor, respectively. Significant positive weights are denoted with filled points and nonsignificant or significant negative weights are denoted with unfilled points; the x-coordinates of each point are randomly jittered by 25% to better visualize filled versus unfilled status. 
No planning electrodes displayed significant perception responses in the reduced GLM lacking a planning regressor, and no perception electrodes displayed significant planning responses in the reduced GLM lacking a perception regressor.
a, Table reporting the number of perception, planning, and production-related electrodes displaying significant positive and negative weights for each GLM regressor. b, Histogram depicting mean high gamma amplitude in the first 500 ms of CI questions for all unmixed perception, planning, and production electrodes. c–e, Canonical cortical surfaces displaying electrodes with significant positive (coloured) or negative (black) GLM weights in the perception (c), production (d), and planning (e) windows of the CI task across all participants. Electrode diameter is scaled to the absolute magnitude of the GLM weight, and electrodes not displaying a significant weight for a given regressor are indicated with small white circles.
a, Cortical reconstructions for all participants displaying the location of all electrodes; the size of each electrode depicts the actual size of its recording area on the cortical surface. GLM classification is indicated by electrode colour. b, Canonical cortical surfaces showing electrode locations from all participants as standard-sized white circles. c, Number of electrodes sampling each area of the canonical cortical surface (1 cm diameter spatial smoothing) after pooling electrodes from all participants. d, Proportion of electrodes displaying significant production-related responses in the CI task (1-cm-diameter spatial smoothing). e, Canonical cortical surfaces displaying electrodes with significant responses related to speech perception, production, and planning in patients with tumour (top) and patients with epilepsy (bottom) separately; electrode diameter scaled to GLM regressor weight. Electrodes not displaying a significant response for a process are depicted as small white circles.
a, Table reporting additional turn-taking behavioural measures for each participant. b, Histograms of gap durations (time between experimenter turn offset and participant turn onset) during unconstrained conversation for each participant; bins are centred on 100 ms increments with a width of 100 ms. c, Scree plots for the PCA analysis of high gamma signals in the task (left) and conversation (right) periods of the recordings; data from each participant are represented by thin lines and the average across participants is denoted with a thick black line. The 95% confidence interval of the linear decay phase across participants (Methods) is also indicated. d, The observed number of electrodes whose cluster membership was not stable (i.e., switched clusters) between the task and conversation with a histogram depicting the distribution of electrode cluster switches expected by chance. e, The observed percentage of electrodes in perception, planning, and production clusters (in conversation-derived PC coefficient space) displaying significant perception, planning, and production responses (per the GLM), respectively, with histograms depicting the percentages expected by chance for each cluster type. f, Canonical cortical surfaces displaying the locations of all electrodes in perception, planning, and production clusters across participants (n = 6) in the task (left) and conversation (right). g, Table reporting summary statistics for PC activity (i.e., time-varying PC score) during unconstrained conversation for each participant.
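The gap measure and binning described in panel b can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: the function names, the millisecond integer timestamps, and the convention that a bin centred on c covers [c − 50 ms, c + 50 ms) are all assumptions. Negative gaps correspond to the participant starting before the experimenter has finished.

```python
from collections import Counter


def gap_durations_ms(experimenter_offsets_ms, participant_onsets_ms):
    """Gap = participant turn onset minus preceding experimenter turn offset."""
    return [on - off
            for off, on in zip(experimenter_offsets_ms, participant_onsets_ms)]


def bin_centre_ms(gap_ms, width_ms=100):
    """Map a gap to the centre of its bin; centres are multiples of width_ms."""
    return int((gap_ms + width_ms / 2) // width_ms) * width_ms


def gap_histogram(gaps_ms, width_ms=100):
    """Count gaps per bin, returned as {bin centre: count} in ascending order."""
    counts = Counter(bin_centre_ms(g, width_ms) for g in gaps_ms)
    return dict(sorted(counts.items()))


# Hypothetical turn times in milliseconds (illustration only).
offsets = [1000, 2500, 4000]
onsets = [1180, 2760, 3960]
gaps = gap_durations_ms(offsets, onsets)
print(gaps)
print(gap_histogram(gaps))
```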
a–f, For 6 participants possessing sufficient numbers of electrodes belonging to multiple GLM classes (Methods): scatterplots depicting electrode distributions in PC coefficient space in the task and conversation periods (top row). Bar graphs depicting the PC coefficients for all electrodes in perception, planning, or production clusters from the PCA performed on task data and conversation data (bottom rows). Participant number given at top of each panel. g, h, For 2 participants possessing mainly planning electrodes (Methods, Extended Data Table 1): bar graphs depicting the PC coefficients for all planning-related electrodes from the PCA performed on task data and conversation data. In the bar graphs, the functional categorization of PCs is indicated by filled bars coloured either green (perception), blue (planning), or red (production). Any clusters rejected due to a high proportion (50%) of mixed electrodes are indicated with grey filled bars.
Castellucci, G.A., Kovach, C.K., Howard, M.A. et al. A speech planning network for interactive language use. Nature 602, 117–122 (2022). https://doi.org/10.1038/s41586-021-04270-z