Categorical speech representation in human superior temporal gyrus

Abstract

Speech perception requires the rapid and effortless extraction of meaningful phonetic information from a highly variable acoustic signal. A powerful example of this phenomenon is categorical speech perception, in which a continuum of acoustically varying sounds is transformed into perceptually distinct phoneme categories. We found that the neural representation of speech sounds is categorically organized in the human posterior superior temporal gyrus. Using intracranial high-density cortical surface arrays, we found that listening to synthesized speech stimuli varying in small and acoustically equal steps evoked distinct and invariant cortical population response patterns that were organized by their sensitivities to critical acoustic features. Phonetic category boundaries were similar between neurometric and psychometric functions. Although speech-sound responses were distributed, spatially discrete cortical loci were found to underlie specific phonetic discrimination. Our results provide direct evidence for acoustic-to-higher-order phonetic level encoding of speech sounds in human language receptive cortex.
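As an illustration of what comparing category boundaries between neurometric and psychometric functions can involve, the following minimal sketch estimates the point along a stimulus continuum where a response proportion crosses 50%. It is not the authors' analysis: the function name `category_boundary`, the linear-interpolation approach, the seven-step continuum and all data values are hypothetical assumptions chosen only to show the idea.

```python
def category_boundary(steps, proportions, criterion=0.5):
    """Estimate where a response function crosses `criterion`.

    Assumption: the boundary is located by linear interpolation between
    the two neighboring stimulus steps that bracket the criterion.
    Returns None if the function never crosses the criterion.
    """
    pairs = list(zip(steps, proportions))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        brackets = (p0 - criterion) * (p1 - criterion) <= 0
        if brackets and p0 != p1:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical seven-step continuum (values are NOT from the study).
steps = [1, 2, 3, 4, 5, 6, 7]
# Proportion of trials on which listeners reported the second category.
psychometric = [0.02, 0.05, 0.10, 0.40, 0.90, 0.97, 0.99]
# Proportion of trials on which a neural classifier chose that category.
neurometric = [0.05, 0.08, 0.20, 0.45, 0.85, 0.95, 0.98]

behavioral_boundary = category_boundary(steps, psychometric)
neural_boundary = category_boundary(steps, neurometric)
boundary_difference = abs(behavioral_boundary - neural_boundary)
```

With real data, a sigmoid fit to each function would usually be preferred over interpolation, since it uses all steps rather than only the two that bracket the crossing.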

Figure 1: Psychophysics of categorical speech perception and speech-evoked responses during intraoperative human cortical recordings.
Figure 2: Categorical organization of neural response patterns to a speech-stimulus continuum.
Figure 3: Correlation of neurometric and psychometric category boundaries.
Figure 4: Topography of discriminative cortical sites in the pSTG underlying categorical speech perception.

Acknowledgements

We are grateful to the four individuals who participated in this experiment and to A. Flinker for help with data acquisition. This research was supported by US National Institutes of Health grants NS21135 (R.T.K.), PO4813 (R.T.K.), F32NS061552 (E.F.C.), K99NS065120 (E.F.C.), FKZ-MK48-2009/003 (J.W.R.) and RI1511/1-3 (J.W.R.).

Author information

Contributions

E.F.C. designed the experiments, collected the data and wrote the manuscript. E.F.C. and J.W.R. analyzed the data, evaluated results and edited the manuscript. J.W.R., N.M.B. and M.S.B. helped with data collection. K.J. and R.T.K. reviewed the manuscript.

Corresponding authors

Correspondence to Edward F Chang or Jochem W Rieger.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4, Supplementary Table 1 and Supplementary Results (PDF 221 kb)

About this article

Cite this article

Chang, E., Rieger, J., Johnson, K. et al. Categorical speech representation in human superior temporal gyrus. Nat Neurosci 13, 1428–1432 (2010). https://doi.org/10.1038/nn.2641
