Speech perception requires the rapid and effortless extraction of meaningful phonetic information from a highly variable acoustic signal. A powerful example of this phenomenon is categorical speech perception, in which a continuum of acoustically varying sounds is transformed into perceptually distinct phoneme categories. We found that the neural representation of speech sounds is categorically organized in the human posterior superior temporal gyrus. Using intracranial high-density cortical surface arrays, we found that listening to synthesized speech stimuli varying in small and acoustically equal steps evoked distinct and invariant cortical population response patterns that were organized by their sensitivities to critical acoustic features. Phonetic category boundaries were similar between neurometric and psychometric functions. Although speech-sound responses were distributed, spatially discrete cortical loci were found to underlie specific phonetic discrimination. Our results provide direct evidence for acoustic-to-higher-order phonetic-level encoding of speech sounds in human language receptive cortex.
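To illustrate what comparing category boundaries between neurometric and psychometric functions can involve, the sketch below estimates a boundary as the point where a response function crosses 50%, using linear interpolation between adjacent stimulus steps. This is one simple way to locate a boundary and is not taken from the article's methods. The function name `category_boundary`, the seven-step continuum, and all proportion values are hypothetical and made up for illustration only.

```python
from typing import Sequence


def category_boundary(
    steps: Sequence[float],
    proportions: Sequence[float],
    criterion: float = 0.5,
) -> float:
    """Return the continuum position where the response function first
    crosses `criterion`, using linear interpolation between adjacent steps."""
    if len(steps) != len(proportions) or len(steps) < 2:
        raise ValueError("steps and proportions must have equal length >= 2")
    for i in range(len(steps) - 1):
        x0, x1 = steps[i], steps[i + 1]
        p0, p1 = proportions[i], proportions[i + 1]
        if p0 == criterion:
            return float(x0)
        # A sign change means the criterion lies between these two steps.
        if (p0 - criterion) * (p1 - criterion) < 0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    if proportions[-1] == criterion:
        return float(steps[-1])
    raise ValueError("response function never crosses the criterion")


# Hypothetical 7-step continuum (illustrative values, not data from the study).
steps = [1, 2, 3, 4, 5, 6, 7]
# Proportion of trials on which listeners labeled the stimulus as category A.
psychometric = [0.98, 0.95, 0.90, 0.60, 0.10, 0.05, 0.02]
# Proportion of trials on which a neural classifier labeled it as category A.
neurometric = [0.97, 0.93, 0.85, 0.55, 0.15, 0.06, 0.03]

psych_boundary = category_boundary(steps, psychometric)
neuro_boundary = category_boundary(steps, neurometric)
print(f"psychometric boundary: {psych_boundary:.3f}")
print(f"neurometric boundary:  {neuro_boundary:.3f}")
print(f"difference:            {abs(psych_boundary - neuro_boundary):.3f}")
```

A small difference between the two estimates would correspond to the kind of agreement the abstract describes. In practice a fitted sigmoid is often preferred over interpolation when responses are noisy.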
We are grateful to the four individuals who participated in this experiment and to A. Flinker for help with data acquisition. This research was supported by US National Institutes of Health grants NS21135 (R.T.K.), PO4813 (R.T.K.), F32NS061552 (E.F.C.), K99NS065120 (E.F.C.), FKZ-MK48-2009/003 (J.W.R.) and RI1511/1-3 (J.W.R.).
The authors declare no competing financial interests.
Cite this article
Chang, E., Rieger, J., Johnson, K. et al. Categorical speech representation in human superior temporal gyrus. Nat Neurosci 13, 1428–1432 (2010). https://doi.org/10.1038/nn.2641