People occasionally use filler phrases or pauses, such as “uh”, “um”, or “y’know,” that interrupt the flow of a sentence and fill silent moments between ordinary (non-filler) phrases. It remains unknown which brain networks are engaged during the utterance of fillers. We addressed this question by quantifying event-related cortical high gamma activity at 70–110 Hz. During extraoperative electrocorticography recordings performed as part of the presurgical evaluation, patients with drug-resistant focal epilepsy were instructed to overtly explain, in a sentence, ‘what is in the image (subject)’, ‘doing what (verb)’, ‘where (location)’, and ‘when (time)’. Time–frequency analysis revealed that the utterance of fillers, compared to that of ordinary words, was associated with a greater magnitude of high gamma augmentation in the association and visual cortex of either hemisphere. Our preliminary results raise the hypothesis that filler utterance often occurs when large-scale networks across the association and visual cortex are engaged in cognitive processing, including lexical retrieval as well as verbal working memory and visual scene scanning.
Regardless of age, gender, or native language, healthy individuals use filler phrases, also known as filled pauses, during spontaneous speech1. Frequent utterance of fillers is tightly associated with increased effort to recall or search for a relevant word2, increased anxiety3, and divided attention4. Disfluent non-native speakers compared to native ones as well as dysphasic patients compared to non-dysphasic ones more frequently utter fillers during verbal communication5,6. Practice and preparation are effective methods to reduce the rate of filler utterance during interviews or presentations because the word recall process becomes more automatic and less effortful7.
What happens in the cerebral cortex when one utters a filler? Only a small number of studies have attempted to determine the neural correlates of filler utterances. Effective study design is a consistent challenge in the field due to the unpredictable timing of naturally occurring filler phrases or pauses. In a study of six healthy adults using functional MRI (fMRI)8, participants were instructed to speak whatever came to mind when viewing Rorschach inkblot plates. The authors reported that trials accompanied by overt filler pauses, compared to those accompanied by completely silent pauses, were associated with increased hemodynamic activation in the left superior temporal gyrus. Another fMRI study characterized the spatial pattern of hemodynamic activation while participants listened to others’ speech that included fillers; it therefore addressed the neural correlates of listening to fillers rather than uttering them9.
Measurement of event-related high gamma activity on electrocorticography (ECoG), a presurgical evaluation method for patients with drug-resistant epilepsy10, provides a unique opportunity to quantify the rapid dynamics of human perception and cognition without increasing the risk of surgical complications11. Task-related high gamma activity at 70–110 Hz is a summary measure of local cortical engagement with a temporal resolution of tens of milliseconds12. Augmentation of high gamma amplitude has been reported to be tightly associated with an increase in firing rate13, hemodynamic activation14, glucose metabolism15, and the probability of stimulation-induced functional impairment16. Conversely, attenuation of high gamma amplitude is associated with a reduced firing rate and hemodynamic deactivation17. Because of its outstanding signal fidelity, ECoG recording is suggested to be capable of accurately measuring the spatiotemporal dynamics of event-related neural modulations at a single-trial level in an individual patient18,19.
In the present study, during extraoperative ECoG recording, each participant was instructed to overtly explain the content of a given image with a sentence including the subject (e.g., A baby), verb (plays with a dog), location (at the beach), and time (during the day; Fig. 1). Due to the challenging nature of this task, all participants intermittently used fillers during sentence production. Although we did not originally employ this task to study the neural correlates of filler utterances versus ordinary phrases, it provided a rare opportunity to determine the spatiotemporal characteristics of high gamma augmentation during this ubiquitous, yet unpredictable, human behavior.
Given that filler phrases are associated with recall effort and word search2, we hypothesized that the spontaneous utterance of fillers, compared to that of non-filler words, would be associated with greater high gamma augmentation across large-scale networks of the association cortex. This hypothesis was further motivated by previous imaging studies of healthy children and adults, which reported that these regions were activated, to the largest extent, during tasks requiring the selection of optimal words among competing alternatives20,21,22,23,24,25. Furthermore, studies of patients with stroke and primary progressive aphasia suggest an association between more severe damage to the association cortex of the left hemisphere and increased rate of filler utterance due to the loss of word retrieval ability3,26,27,28,29.
We studied three native English-speaking patients (Table 1; age: 15, 16, and 17 years; 1 female), who underwent the sentence production task (Fig. 1A) during extraoperative subdural ECoG recording at Children’s Hospital of Michigan, Detroit, USA. None of these patients had massive brain malformations observable on an MRI or severe cognitive impairment defined by a verbal IQ of < 70. This study, approved by the Institutional Review Board at Wayne State University, was performed in accordance with the approved guidelines. Informed consent and assent were obtained from the guardians of patients and patients, respectively.
Acquisition of ECoG and three-dimensional magnetic resonance surface images
ECoG and MRI data acquisition methods were described elsewhere30. Platinum disk electrodes (10 mm center-to-center distance; 3 mm exposed diameter) were placed in the subdural space of the hemisphere estimated to contain the epileptogenic zone, based on collective evidence from the noninvasive presurgical evaluation31. ECoG signals were continuously acquired at a sampling rate of 1,000 Hz using the Nihon Kohden Neurofax 1100A Digital System (Nihon Kohden America Inc., Foothill Ranch, CA, USA). Channels classified as seizure onset zone, those generating interictal spike discharges, and those showing artifacts during the task were excluded from further analysis. This is a common procedure across ECoG studies of event-related high gamma activity and is expected to improve the generalizability of the findings12,32,33,34. The number of nonepileptic channels included in the analysis ranged from 100 to 128 per patient. We created a three-dimensional surface image for each patient with electrode sites defined directly on the pial surface30. FreeSurfer scripts were used to parcellate the cortical gyri of each individual surface image (https://surfer.nmr.mgh.harvard.edu), in order to determine the anatomical label of each electrode location35,36 (Fig. 2). In all three patients, electrode coverage included the lateral frontal, parietal, and temporal regions.
Sentence production task
At the bedside during extraoperative ECoG recording, participants were instructed to freely explain, in a sentence, the content of a visual scene. Each scene was a photograph sampled from the International Affective Picture System37. Each photograph was 9 × 12 cm and presented at the center of a 19 inch LCD monitor placed 60 cm in front of the patient. Participants were instructed to include the following domains in the sentence in any order: ‘subject (e.g., A hippo)’, ‘verb (is bathing)’, ‘location (in the water)’, and ‘time (in summer)’ (Fig. 1A). Each participant was instructed to say, ‘I don’t know’, in case she/he failed to understand the content of a given scene. Each trial began with a 2.0 or 2.5 s fixation cross followed by the presentation of the photograph. The scene was presented until the patient completed their response, at which point the examiner manually started the next trial. Overt verbal responses were recorded using a WS-823 digital voice recorder (Olympus America Inc, Hauppauge, NY, USA) and synchronized with ECoG signals via a DC input to the ECoG amplifier10. The timing of the picture stimulus presentation was likewise synchronized using a photosensor attached to the corner of the LCD monitor and the ECoG amplifier via DC input.
Event classification and marking
The onset and offset of filler and ordinary phrases were identified and marked using recorded vocal sounds synchronized to the ECoG signal (Fig. 1B). A filler was defined as an extraneous word or set of words (e.g., “uh”, “um”, “y’know”, or “well”1,38). We used Cool Edit Pro version 2 (Syntrillium Software Corp., Phoenix, AZ, USA) to aid in the manual marking of each phrase of interest39.
We determined the dynamics of high gamma modulations during filler and non-filler utterances using a method similar to what we have previously reported30. Briefly, we applied a complex demodulation method to transform ECoG signals from the time–voltage into time–frequency domain in steps of 5 Hz and 10 ms40,41. For each ECoG channel, we quantified the mean percentage change of high gamma amplitude within 70–110 Hz in 10 ms bins relative to a 400-ms reference period at 600–200 ms prior to the presentation of the photograph stimulus (Fig. 1). High gamma amplitude, time-locked to utterance onset and offset, was plotted as a function of time (Figs. 4, S1).
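The baseline normalization described above can be illustrated with a minimal Python sketch. It assumes the high gamma amplitude has already been averaged across the 70–110 Hz frequency steps and sampled in 10 ms bins; the function name `percent_change_from_baseline` and its arguments are hypothetical and are not taken from the original analysis software.

```python
from statistics import mean


def percent_change_from_baseline(times_ms, amplitude,
                                 ref_start_ms=-600, ref_end_ms=-200):
    """Express each amplitude bin as a percent change from the reference mean.

    times_ms  : bin start times in ms relative to stimulus onset (10 ms steps)
    amplitude : band-averaged high gamma amplitude for each bin (assumed input)
    The default window covers the 400 ms period at 600-200 ms before onset.
    """
    if len(times_ms) != len(amplitude):
        raise ValueError("times_ms and amplitude must have the same length")

    # Collect the bins that fall inside the reference period.
    baseline = [a for t, a in zip(times_ms, amplitude)
                if ref_start_ms <= t < ref_end_ms]
    if not baseline:
        raise ValueError("no bins fall inside the reference period")

    ref = mean(baseline)
    if ref == 0:
        raise ValueError("reference amplitude is zero; percent change undefined")

    # Percent change relative to the mean amplitude of the reference period.
    return [100.0 * (a - ref) / ref for a in amplitude]


# Illustrative data only: amplitude 2.0 before stimulus onset, 3.0 afterwards.
example_times = list(range(-600, 400, 10))
example_amplitude = [2.0 if t < 0 else 3.0 for t in example_times]
example_change = percent_change_from_baseline(example_times, example_amplitude)
```

With these illustrative values, bins before stimulus onset map to 0% and bins after onset map to a 50% augmentation.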
Statistical analysis to determine the effect of filler utterances on high gamma activity
To determine whether fillers accounted for the variance in utterance-related high gamma modulations, we employed a mixed model analysis at each electrode site of a given patient (SPSS Statistics 25, IBM Corp., Chicago, IL, USA). The dependent variable was the percent change of high gamma activity during a 300 ms utterance period. The following variables were treated as fixed effects: (1) ‘filler utterance’ (1 if an uttered phrase was a filler and 0 if a non-filler), (2) ‘onset/offset of phrase’ (1 during the 300 ms period immediately after utterance onset and 0 during the 300 ms period immediately before utterance offset), (3) trial number, and (4) phrase duration (ms). This analysis was designed to determine whether a filler was associated with increased neural activation independently of the three co-variables mentioned above. The intercept was treated as a random effect. The statistical significance threshold was set at p = 0.05. Cortical regions with preferential activation during fillers were identified as those with high gamma effects exceeding two standard deviations above or below the mean across all electrodes for the patient (Fig. 3).
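The final selection step, flagging electrode sites whose filler effect lies more than two standard deviations from the patient's mean, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' SPSS procedure: the per-site filler coefficients are taken as given, the sample standard deviation is used (the text does not say whether the sample or population form was applied), and the names `flag_filler_preferential_sites` and `example_effects` are hypothetical.

```python
from statistics import mean, stdev


def flag_filler_preferential_sites(filler_effects, n_sd=2.0):
    """Split sites into augmentation/attenuation outliers for one patient.

    filler_effects : dict mapping channel label -> mixed model filler
                     coefficient (assumed to be estimated beforehand)
    n_sd           : number of standard deviations used as the cutoff
    """
    values = list(filler_effects.values())
    if len(values) < 2:
        raise ValueError("need at least two sites to compute a deviation")

    mu = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation

    upper = mu + n_sd * sd
    lower = mu - n_sd * sd
    return {
        "augmentation": sorted(ch for ch, v in filler_effects.items() if v > upper),
        "attenuation": sorted(ch for ch, v in filler_effects.items() if v < lower),
    }


# Illustrative coefficients only (not data from the study).
example_effects = {f"ch{i}": (1.0 if i % 2 == 0 else -1.0) for i in range(18)}
example_effects["A"] = 20.0
example_effects["B"] = -20.0
example_flags = flag_filler_preferential_sites(example_effects)
```

Because the cutoff is computed per patient, a site is flagged relative to that patient's own distribution of filler effects rather than against a fixed amplitude value.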
Table 2 summarizes the behavioral data of each patient, including the number and duration of filler and non-filler utterances. Filler utterances were shorter in duration than ordinary phrases (Table 2). Figure 3 presents the locations of electrode sites at which the filler effect was more than two standard deviations above or below the mean across all electrode sites for a given patient. Ten sites in the association cortex and one in the left lingual gyrus (i.e., visual cortex) showed filler-preferential high gamma augmentation. Figure 4 shows the temporal dynamics of utterance-related high gamma activity at sites showing filler-preferential high gamma augmentation. Blue circles (N = 2) in Fig. 3 indicate the locations of sites showing filler-preferential high gamma attenuation. Table 3 summarizes the mixed model coefficients, t-scores, and confidence intervals of the filler effects at the 13 sites mentioned above. Online Supplementary Figure S1 shows utterance-related high gamma augmentation at a face sensorimotor cortical site that occurred during both filler and non-filler utterances.
Significance of filler-preferential high gamma augmentation
The present study indicated that the utterance of fillers, compared to that of ordinary phrases, was associated with greater high gamma augmentation primarily in the association cortex. A plausible explanation for our ECoG observation is that filler utterances are more likely to occur while large-scale networks across the association cortex remain engaged in cognitive processing prior to motor responses (i.e., verbal articulation). This hypothesis is consistent with the generally accepted notion that filler utterances are a behavioral marker of increased effort to recall, search, or select a relevant word2. The involvement of large-scale association networks reflects the complexity of the sentence production task. Observing and fully describing a pictured scene involves integrating perceptual, working memory, motor, and cognitive functions, including at least the semantic processing of the perceived image as well as lexical and phonological access in a sentence context42,43. The sentence production task requires extensive analysis of the visual scene across multiple domains and a long utterance response. Collective evidence indicates that semantic, lexical, and phonological processes are exerted by large-scale networks in the temporal, parietal, and frontal lobe association cortex with left-hemispheric dominance33,44,45. Linking each part of the description into a single sentence also requires substantial verbal working memory activation, which may further involve the association cortex of either hemisphere46,47,48. In contrast, overt production of non-filler phrases was previously reported to maximize the degree of neural activation in the primary sensorimotor cortex following the subsidence of neural activation in the left inferior frontal gyrus32,44.
One cannot rule out the possibility that our patients spontaneously used fillers as a method to communicate their intention49,50. In other words, one may subconsciously use fillers to signal that she/he still intends to speak further or needs time to collect thoughts. A behavioral study previously reported that the audience rated speakers using filler pauses higher in presentation skills than those using completely silent pauses51. A previous fMRI study of 16 healthy adults investigated the effect of listening to speech including fillers9; participants were instructed to listen carefully to auditory sentences delivered via headphones. This fMRI study reported that speech including fillers, compared to fluent speech, elicited greater degrees of hemodynamic activation in the superior temporal gyri as well as medial frontal regions.
The present study did not provide causal evidence that filler utterance facilitated the cognitive process. Our observation of filler-preferential neuronal activation in the association cortex does not indicate that frequent usage of fillers improves the verbal response.
Since all patients were adolescents, we cannot rule out the possibility that the reported neuronal dynamics could be specific to this phase of development.
The small sample size is a major limitation of the present study. Thus, one should consider this research as a hypothesis-generating study rather than as a definitive investigation. However, because the signal fidelity of ECoG is more than 100 times better than that of scalp EEG19, a number of studies suggest that one can evaluate task-related high gamma modulations on a per-trial basis12,18,52. Each patient uttered filler phrases only three to 21 times but more than 300 ordinary phrases during the task (Table 2). Such a small number of filler utterances limited the statistical power of the mixed model analysis. Only seven of the 11 sites showing a positive filler effect on high gamma activity would survive a false discovery rate (FDR) correction for approximately 100 subdural electrode channels per patient (Table 3). Correction for multiple tests decreases the risk of Type I error but increases the risk of Type II error; given the exploratory nature of this analysis, we opted not to apply the FDR correction. Further studies using a larger number of patients and trials are necessary to validate or disprove the hypothesis generated in the present study. For example, analysis of ECoG signals during task-free communications may increase the chance of securing sufficient statistical power53.
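To make the trade-off between uncorrected and FDR-corrected thresholds concrete, the sketch below applies the Benjamini–Hochberg step-up procedure to a list of per-channel p-values. The text does not state which FDR procedure was considered, so the choice of Benjamini–Hochberg is an assumption, as are the function name `benjamini_hochberg` and the illustrative p-values, which are not data from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking tests that survive FDR control at q.

    Assumption: Benjamini-Hochberg step-up procedure. The k-th smallest
    p-value (rank k of m) is compared against q * k / m, and every test up
    to the largest passing rank is declared significant.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank whose p-value is under its rank-specific cutoff.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            max_rank = rank

    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            survives[idx] = True
    return survives


# Illustrative p-values only, deliberately listed out of order.
example_p = [0.041, 0.001, 0.216, 0.008, 0.039,
             0.060, 0.042, 0.205, 0.074, 0.212]
example_uncorrected = [p < 0.05 for p in example_p]
example_corrected = benjamini_hochberg(example_p, q=0.05)
```

In this toy example, five tests fall below the uncorrected threshold of 0.05 but only two survive the correction, which mirrors the loss of nominally significant sites described above.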
In the present study, we computed the percentage change of high gamma amplitude relative to that during a reference period. This analytic approach was based on the assumption that the patient was resting during the reference period between trials.
Laserna, C. M., Seih, Y.-T. & Pennebaker, J. W. Um.. who like says you know. J. Lang. Soc. Psychol. 33, 328–338 (2014).
Dockrell, J. E., Messer, D., George, R. & Wilson, G. Children with word-finding difficulties–prevalence, presentation and naming problems. Int. J. Lang. Commun. Disord. 33, 445–454 (1998).
Christenfeld, N. & Creager, B. Anxiety, alcohol, aphasia, and ums. J. Pers. Soc. Psychol. 70, 451–460 (1996).
Oomen, C. C. E. & Postma, A. Effects of divided attention on the production of filled pauses and repetitions. J. Speech. Lang. Hear. Res. 44, 997–1004 (2001).
Norris, M. R. & Drummond, S. S. Communicative functions of laughter in aphasia. J. Neurolinguist. 11, 391–402 (1998).
Tomokiyo, L. M. Linguistic properties of non-native speech. in 3, 1335–1338 (IEEE, 2000).
Saul, M. Caroline Kennedy no whiz with words. New York Daily News. https://www.nydailynews.com/news/politics/caroline-kennedy-no-whiz-words-article-1.355586. Accessed 3 July 2020 (2008).
Matsumoto, K. et al. Frequency and neural correlates of pauses in patients with formal thought disorder. Front. Psychiatry 4, 1–9 (2013).
Eklund, R. & Ingvar, M. Supplementary Motor Area Activation in Disfluency Perception: An fMRI Study of Listener Neural Responses to Spontaneously Produced Unfilled and Filled Pauses. in 2016, 1378–1381 (ISCA, 2016).
Kambara, T., Brown, E. C., Silverstein, B. H., Nakai, Y. & Asano, E. Neural dynamics of verbal working memory in auditory description naming. Sci. Rep. 8, 1–12 (2018).
Uematsu, M., Matsuzaki, N., Brown, E. C., Kojima, K. & Asano, E. Human occipital cortices differentially exert saccadic suppression: intracranial recording in children. NeuroImage 83, 224–236 (2013).
Crone, N. E., Korzeniewska, A. & Franaszczuk, P. J. Cortical gamma responses: searching high and low. Int. J. Psychophysiol. 79, 9–15 (2011).
Ray, S., Crone, N. E., Niebur, E., Franaszczuk, P. J. & Hsiao, S. S. Neural correlates of high-gamma oscillations (60–200 Hz) in macaque local field potentials and their potential implications in electrocorticography. J. Neurosci. 28, 11526–11536 (2008).
Scheeringa, R. et al. Neuronal dynamics underlying high- and low- frequency EEG oscillations contribute independently to the human BOLD signal. Neuron 69, 572–583 (2011).
Nishida, M., Juhász, C., Sood, S., Chugani, H. T. & Asano, E. Cortical glucose metabolism positively correlates with gamma-oscillations in nonlesional focal epilepsy. NeuroImage 42, 1275–1284 (2008).
Arya, R., Horn, P. S. & Crone, N. E. ECoG high-gamma modulation versus electrical stimulation for presurgical language mapping. Epilepsy Behav. 79, 26–33 (2018).
Shmuel, A., Augath, M., Oeltermann, A. & Logothetis, N. K. Negative functional MRI response correlates with decreases in neuronal activity in monkey visual area V1. Nat. Neurosci. 9, 569–577 (2006).
Flinker, A. et al. Single-trial speech suppression of auditory cortex activity in humans. J. Neurosci. 30, 16643–16650 (2010).
Ball, T., Kern, M., Mutschler, I., Aertsen, A. & Schulze-Bonhage, A. Signal quality of simultaneously recorded invasive and non-invasive EEG. NeuroImage 46, 708–716 (2009).
Thompson-Schill, S. L., D’Esposito, M. & Kan, I. P. Effects of repetition and competition on activity in left prefrontal cortex during word generation. Neuron 23, 513–522 (1999).
Crosson, B. et al. Relative shift in activity from medial to lateral frontal cortex during internally versus externally guided word generation. J. Cognit. Neurosci. 13, 272–283 (2001).
Holland, S. K. et al. Normal fMRI brain activation patterns in children performing a verb generation task. NeuroImage 14, 837–843 (2001).
Costafreda, S. G. et al. A systematic review and quantitative appraisal of fMRI studies of verbal fluency: role of the left inferior frontal gyrus. Hum. Brain Mapp. 27, 799–810 (2006).
Grèzes, J. & Decety, J. Functional anatomy of execution, mental simulation, observation, and verb generation of actions: a meta-analysis. Hum. Brain Mapp. 12, 1–19 (2001).
Wagner, S., Sebastian, A., Lieb, K., Tüscher, O. & Tadić, A. A coordinate-based ALE functional MRI meta-analysis of brain activation during verbal fluency tasks in healthy control subjects. BMC Neurosci. 15, 78–13 (2014).
Love, T., Swinney, D., Wong, E. & Buxton, R. Perfusion imaging and stroke: a more sensitive measure of the brain bases of cognitive deficits. Aphasiology 16, 873–883 (2002).
Wilson, S. M. et al. Connected speech production in three variants of primary progressive aphasia. Brain 133, 2069–2088 (2010).
Mack, J. E. et al. What do pauses in narrative production reveal about the nature of word retrieval deficits in PPA?. Neuropsychologia 77, 211–222 (2015).
Thothathiri, M., Schwartz, M. F. & Thompson-Schill, S. L. Selection for position: the role of left ventrolateral prefrontal cortex in sequencing language. Brain Lang. 113, 28–38 (2010).
Nakai, Y. et al. Three- and four-dimensional mapping of speech and language in patients with epilepsy. Brain 140, 1351–1370 (2017).
Asano, E., Juhász, C., Shah, A., Sood, S. & Chugani, H. T. Role of subdural electrocorticography in prediction of long-term seizure outcome in epilepsy surgery. Brain 132, 1038–1047 (2009).
Flinker, A. et al. Redefining the role of Broca’s area in speech. Proc. Natl. Acad. Sci. U.S.A. 112, 2871–2875 (2015).
Forseth, K. J. et al. A lexical semantic hub for heteromodal naming in middle fusiform gyrus. Brain 141, 2112–2126 (2018).
Kambara, T. et al. Presurgical language mapping using event-related high-gamma activity: The Detroit procedure. Clin. Neurophysiol. 129, 145–154 (2018).
Desikan, R. S. et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage 31, 968–980 (2006).
Nishida, M. et al. Brain network dynamics in the human articulatory loop. Clin. Neurophysiol. 128, 1473–1487 (2017).
Lang, P. J., Bradley, M. M. & Cuthbert, B. N. International affective picture system (IAPS): affective ratings of pictures and instruction manual. Technical Report A-8 (2008).
Bortfeld, H., Leon, S. D., Bloom, J. E., Schober, M. F. & Brennan, S. E. Disfluency rates in conversation: effects of age, relationship, topic, role, and gender. Lang. Speech 44, 123–147 (2001).
Brown, E. C. et al. In vivo animation of auditory-language-induced gamma-oscillations in children with intractable focal epilepsy. NeuroImage 41, 1120–1131 (2008).
Papp, N. & Ktonas, P. Critical evaluation of complex demodulation techniques for the quantification of bioelectrical activity. Biomed. Sci. Instrum. 13, 135–145 (1977).
Hoechstetter, K. et al. BESA source coherence: a new method to study cortical oscillatory coupling. Brain Topogr. 16, 233–238 (2004).
Shelton, J. R. & Caramazza, A. Deficits in lexical and semantic processing: implications for models of normal language. Psychon. Bull. Rev. 6, 5–27 (1999).
Dell, G. S. & O’Seaghdha, P. G. Stages of lexical access in language production. Cognition 42, 287–314 (1992).
Nakai, Y. et al. Four-dimensional functional cortical maps of visual and auditory language: Intracranial recording. Epilepsia 60, 255–267 (2019).
Hamberger, M. J., Habeck, C. G., Pantazatos, S. P., Williams, A. C. & Hirsch, J. Shared space, separate processes: Neural activation patterns for auditory description and visual object naming in healthy adults. Hum. Brain Mapp. 35, 2507–2520 (2014).
Kambara, T. et al. Spatio-temporal dynamics of working memory maintenance and scanning of verbal information. Clin. Neurophysiol. 128, 882–891 (2017).
Chen, S. H. A. & Desmond, J. E. Cerebrocerebellar networks during articulatory rehearsal and verbal working memory tasks. NeuroImage 24, 332–338 (2005).
Paulesu, E., Frith, C. D. & Frackowiak, R. S. J. The neural correlates of the verbal component of working memory. Nature 362, 342–345 (1993).
Fox Tree, J. E. Folk notions of um and uh, you know, and like. Text Talk Interdiscip. J. Lang. Discourse Commun. Stud. 27, 297–314 (2007).
Maclay, H. & Osgood, C. E. Hesitation phenomena in spontaneous english speech. WORD 15, 19–44 (1959).
Christenfeld, N. Does it hurt to say um?. J. Nonverbal Behav. 19, 171–186 (1995).
Johnson, E. L., Tang, L., Yin, Q., Asano, E. & Ofen, N. Direct brain recordings reveal prefrontal cortex dynamics of memory development. Sci. Adv. 4, eaat3702–13 (2018).
Arya, R. et al. Electrocorticographic language mapping in children by high-gamma synchronization during spontaneous conversation: Comparison with conventional electrical cortical stimulation. Epilepsy Res. 110, 78–87 (2015).
This work was supported by NIH Grant NS064033 (to E.A.).
The authors declare no competing interests.
Sugiura, A., Alqatan, Z., Nakai, Y. et al. Neural dynamics during the vocalization of ‘uh’ or ‘um’. Sci Rep 10, 11987 (2020). https://doi.org/10.1038/s41598-020-68606-x