  • Review Article

The what, where and how of auditory-object perception

Key Points

  • An auditory object is a perceptual construct, corresponding to the sound that can be assigned to a particular acoustic source. An auditory object spans acoustic events that unfold over time, and a sequence of objects forms a 'stream': for example, when a person is walking, the sound of each step is a unique auditory object but the temporal sequence of footsteps is linked together to form a stream.

  • An auditory object is constructed from the spectrotemporal regularities in the acoustic environment. More specifically, an auditory stimulus comes into our awareness as a sound as a result of the simultaneous and sequential principles that group the acoustic features of the auditory stimulus into stable spectrotemporal entities.

  • Auditory-object processing occurs in the cortex. In particular, the ventral auditory pathway mediates the computations underlying a listener's ability to perceive a sound (auditory object), whereas object-related information that is found in the dorsal pathway is used in the pursuit of audiomotor behaviours.

  • Neural correlates of the perception of an auditory object are found in the auditory cortex. Whereas some studies indicate that the ventral pathway contains brain regions specialized for auditory-object processing, auditory perception is most likely to be mediated by a broad network of brain areas in this pathway.

  • A hallmark of auditory-object processing is that it can be influenced by attention and that attention can act on the object itself and not the lower-level spectrotemporal details of the auditory stimulus. Both single-unit and functional imaging studies demonstrate the effects of attention on the representation of auditory objects in the auditory cortex.

Abstract

The fundamental perceptual unit in hearing is the 'auditory object'. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood.

Figure 1: The transformation of an acoustic stimulus into a perceptual representation of a sound.
Figure 2: Dual pathways of information flow in the auditory system and the organization of the auditory cortex.
Figure 3: Auditory streaming.
Figure 4: Categorization in the ventral auditory pathway.
Figure 5: Strategies for coding auditory object identity.

References

  1. Griffiths, T. D. & Warren, J. D. What is an auditory object? Nature Rev. Neurosci. 5, 887–892 (2004).

    Article  CAS  Google Scholar 

  2. Rauschecker, J. P. Processing of complex sounds in the auditory cortex of cat, monkey, and man. Acta Otolaryngol. Suppl. 532, 34–38 (1997).

    Article  CAS  PubMed  Google Scholar 

  3. Kaas, J. H. & Hackett, T. A. Subdivisions of auditory cortex and processing streams in primates. Proc. Natl Acad. Sci. USA 97, 11793–11799 (2000).

    Article  CAS  PubMed  Google Scholar 

  4. Romanski, L. M. et al. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nature Neurosci. 2, 1131–1136 (1999).

    Article  CAS  PubMed  Google Scholar 

  5. Rauschecker, J. P. & Tian, B. Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc. Natl Acad. Sci. USA 97, 11800–11806 (2000).

    Article  CAS  PubMed  Google Scholar 

  6. Rauschecker, J. P. & Scott, S. K. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nature Neurosci. 12, 718–724 (2009).

    Article  CAS  PubMed  Google Scholar 

  7. Recanzone, G. H. & Cohen, Y. E. Serial and parallel processing in the primate auditory cortex revisited. Behav. Brain Res. 206, 1–7 (2010).

    Article  PubMed  Google Scholar 

  8. Sharpee, T. O., Atencio, C. A. & Schreiner, C. E. Hierarchical representations in the auditory cortex. Curr. Opin. Neurobiol. 21, 761–767 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Bendor, D. & Wang, X. Cortical representations of pitch in monkeys and humans. Curr. Opin. Neurobiol. 16, 391–399 (2006).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Fishman, Y. I. & Steinschneider, M. in The Oxford Handbook of Auditory Science: the Auditory Brain (ed. Rees, A.) 215–245 (Oxford Univ. Press, 2010).

    Google Scholar 

  11. Bregman, A. S. Auditory Scene Analysis (MIT Press, 1990).

    Book  Google Scholar 

  12. Winkler, I., Denham, S. L. & Nelken, I. Modeling the auditory scene: predictive regularity representations and perceptual objects. Trends Cogn. Sci. 13, 532–540 (2009).

    Article  PubMed  Google Scholar 

  13. Kubovy, M. & Van Valkenburg, D. Auditory and visual objects. Cognition 80, 97–126 (2001).

    Article  CAS  PubMed  Google Scholar 

  14. Shinn-Cunningham, B. G. Object-based auditory and visual attention. Trends Cogn. Sci. 12, 182–186 (2008).

    Article  PubMed  PubMed Central  Google Scholar 

  15. Schnupp, J. W., Nelken, I. & King, A. J. Auditory Neuroscience: Making Sense of Sound (MIT Press, 2012).

    Google Scholar 

  16. Miller, C. T. & Cohen, Y. E. in Primate Neuroethology (eds Ghazanfar, A. & Platt, M. L.) 237–255 (Oxford Univ. Press, 2010).

    Book  Google Scholar 

  17. Alain, C. & Arnott, S. R. Selectively attending to auditory objects. Front. Biosci. 5, D202–D212 (2000).

    Article  CAS  PubMed  Google Scholar 

  18. DiCarlo, J. J., Zoccolan, D. & Rust, N. C. How does the brain solve visual object recognition? Neuron 73, 415–434 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Ding, N. & Simon, J. Z. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc. Natl Acad. Sci. USA 109, 11854–11859 (2012).

    Article  PubMed  Google Scholar 

  20. Reddy, L. & Kanwisher, N. Coding of visual objects in the ventral stream. Curr. Opin. Neurobiol. 16, 408–414 (2006).

    Article  CAS  PubMed  Google Scholar 

  21. Miller, C. T., Dibble, E. & Hauser, M. D. Amodal completion of acoustic signals by a nonhuman primate. Nature Neurosci. 4, 783–784 (2001).

    Article  CAS  PubMed  Google Scholar 

  22. Petkov, C. I., O'Connor, K. N. & Sutter, M. L. Encoding of illusory continuity in primary auditory cortex. Neuron 54, 153–165 (2007).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Bendixen, A., Schroger, E. & Winkler, I. I heard that coming: event-related potential evidence for stimulus-driven prediction in the auditory system. J. Neurosci. 29, 8447–8451 (2009). The authors propose a key role for the auditory cortex in the generation of predictions about sequences of ongoing sounds. ERP recordings demonstrate that the neural response to a predictable but omitted sound looks very similar to the neural response to the tone when actually present.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Shinn-Cunningham, B. G. & Wang, D. Influences of auditory object formation on phonemic restoration. J. Acoust. Soc. Am. 123, 295–301 (2008).

    Article  PubMed  Google Scholar 

  25. Warren, R. M., Obusek, C. J. & Ackroff, J. M. Auditory induction: perceptual synthesis of absent sounds. Science 176, 1149–1151 (1972).

    Article  CAS  PubMed  Google Scholar 

  26. Micheyl, C. et al. The neurophysiological basis of the auditory continuity illusion: a mismatch negativity study. J. Cogn. Neurosci. 15, 747–758 (2003).

    Article  PubMed  Google Scholar 

  27. Ungerleider, L. G. & Mishkin, M. in Analysis of Visual Behavior (eds Ingle, D. J., Goodale, M. A. & Mansfield, R. J.) 549–586 (MIT Press, 1982).

    Google Scholar 

  28. Rust, N. C. & Stocker, A. A. Ambiguity and invariance: two fundamental challenges for visual processing. Curr. Opin. Neurobiol. 20, 382–388 (2010).

    Article  CAS  PubMed  Google Scholar 

  29. Ison, M. J. & Quiroga, R. Q. Selectivity and invariance for visual object perception. Front. Biosci. 13, 4889–4903 (2008).

    Article  PubMed  Google Scholar 

  30. Riesenhuber, M. & Poggio, T. Neural mechanisms of object recognition. Curr. Opin. Neurobiol. 12, 162–168 (2002).

    Article  CAS  PubMed  Google Scholar 

  31. Riesenhuber, M. & Poggio, T. Models of object recognition. Nature Neurosci. 3, 1199–1204 (2000).

    Article  CAS  PubMed  Google Scholar 

  32. Tian, B., Reser, D., Durham, A., Kustov, A. & Rauschecker, J. P. Functional specialization in rhesus monkey auditory cortex. Science 292, 290–293 (2001).

    Article  CAS  PubMed  Google Scholar 

  33. Alain, C., Arnott, S. R., Hevenor, S., Graham, S. & Grady, C. L. “What” and “where” in the human auditory system. Proc. Natl Acad. Sci. USA 98, 12301–12306 (2001).

    Article  CAS  PubMed  Google Scholar 

  34. Maeder, P. P. et al. Distinct pathways involved in sound recognition and localization: a human fMRI study. Neuroimage 14, 802–816 (2001).

    Article  CAS  PubMed  Google Scholar 

  35. Arnott, S. R., Binns, M. A., Grady, C. L. & Alain, C. Assessing the auditory dual-pathway model in humans. Neuroimage 22, 401–408 (2004).

    Article  PubMed  Google Scholar 

  36. Obleser, J. et al. Vowel sound extraction in anterior superior temporal cortex. Hum. Brain Mapp. 27, 562–571 (2006).

    Article  PubMed  Google Scholar 

  37. Chang, E. F. et al. Categorical speech representation in human superior temporal gyrus. Nature Neurosci. 13, 1428–1432 (2010).

    Article  CAS  PubMed  Google Scholar 

  38. Binder, J. R., Liebenthal, E., Possing, E. T., Medler, D. A. & Ward, B. D. Neural correlates of sensory and decision processes in auditory object identification. Nature Neurosci. 7, 295–301 (2004). The authors attempt to identify both sensory and decision-making activity in the human brain using fMRI. They demonstrate a functional distinction between sensory and decision mechanisms underlying auditory-object identification.

    Article  CAS  PubMed  Google Scholar 

  39. Hill, K. T. & Miller, L. M. Auditory attentional control and selection during cocktail party listening. Cereb. Cortex 20, 583–590 (2010).

    Article  PubMed  Google Scholar 

  40. Lee, A. K. et al. Auditory selective attention reveals preparatory activity in different cortical regions for selection based on source location and source pitch. Front. Neurosci. 6, 190 (2012). The authors combined magnetoencephalography recordings and structural MRI data to map the attentional networks involved in selectively attending to either spatial or non-spatial features of a sound. Left frontal eye fields were activated by spatial attention, whereas lateral posterior superior temporal sulcus was activated by attention to pitch.

    PubMed  Google Scholar 

  41. Cohen, Y. E. et al. A functional role for the ventrolateral prefrontal cortex in non-spatial auditory cognition. Proc. Natl Acad. Sci. USA 106, 20045–20050 (2009).

    Article  PubMed  Google Scholar 

  42. Obleser, J. & Eisner, F. Pre-lexical abstraction of speech in the auditory cortex. Trends Cogn. Sci. 13, 14–19 (2009).

    Article  PubMed  Google Scholar 

  43. Rauschecker, J. P. Ventral and dorsal streams in the evolution of speech and language. Front. Evol. Neurosci. 4, 7 (2012).

    Article  PubMed  PubMed Central  Google Scholar 

  44. Bizley, J. K., Walker, K. M., Silverman, B. W., King, A. J. & Schnupp, J. W. Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. J. Neurosci. 29, 2064–2075 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Miller, L. M. & Recanzone, G. H. Populations of auditory cortical neurons can accurately encode acoustic space across stimulus intensity. Proc. Natl Acad. Sci. USA 106, 5931–5935 (2009). The authors measured neural responses to sounds that varied in spatial location and used optimal decoding strategies to assess whether neural responses could support behavioural localization abilities. Although neural populations throughout the auditory cortex contained spatial information in their responses, only those in the caudolateral field had sufficient information to account for behaviour.

    Article  PubMed  Google Scholar 

  46. Stecker, G. C. & Middlebrooks, J. C. Distributed coding of sound locations in the auditory cortex. Biol. Cybern. 89, 341–349 (2003).

    Article  PubMed  Google Scholar 

  47. Harrington, I. A., Stecker, G. C., Macpherson, E. A. & Middlebrooks, J. C. Spatial sensitivity of neurons in the anterior, posterior, and primary fields of cat auditory cortex. Hear. Res. 240, 22–41 (2008).

    Article  PubMed  PubMed Central  Google Scholar 

  48. Cloutman, L. L. Interaction between dorsal and ventral processing streams: where, when and how? Brain Lang. http://dx.doi.org/10.1016/j.bandl.2012.08.003 (2012).

  49. Middlebrooks, J. C. & Onsan, Z. A. Stream segregation with high spatial acuity. J. Acoust. Soc. Am. 132, 3896–3911 (2012).

    Article  PubMed  PubMed Central  Google Scholar 

  50. Middlebrooks, J. C. & Bremen, P. Spatial stream segregation by auditory cortical neurons. J. Neurosci. 33, 10986–11001 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Rauschecker, J. P. An expanded role for the dorsal auditory pathway in sensorimotor control and integration. Hear. Res. 271, 16–25 (2011).

    Article  PubMed  Google Scholar 

  52. Teki, S., Chait, M., Kumar, S., von Kriegstein, K. & Griffiths, T. D. Brain bases for auditory stimulus-driven figure–ground segregation. J. Neurosci. 31, 164–171 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  53. Leaver, A. M., Van Lare, J., Zielinski, B., Halpern, A. R. & Rauschecker, J. P. Brain activation during anticipation of sound sequences. J. Neurosci. 29, 2477–2485 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  54. Cusack, R. The intraparietal sulcus and perceptual organization. J. Cogn. Neurosci. 17, 641–651 (2005).

    Article  PubMed  Google Scholar 

  55. Rao, S. C., Rainer, G. & Miller, E. K. Integration of what and where in the primate prefrontal cortex. Science 276, 821–824 (1997).

    Article  CAS  PubMed  Google Scholar 

  56. Bendor, D. & Wang, X. The neuronal representation of pitch in primate auditory cortex. Nature 436, 1161–1165 (2005). The authors demonstrate that a subset of neurons — specifically in the low-frequency border of area A1 and the rostral field in the marmoset — respond to sounds with a fundamental frequency that matches their characteristic frequency regardless of whether the fundamental frequency is present or not.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Lee, C. C. & Middlebrooks, J. C. Specialization for sound localization in fields A1, DZ, and PAF of cat auditory cortex. J. Associ. Res. Otolaryngol. 14, 61–82 (2013).

    Article  Google Scholar 

  58. Camalier, C. R., D'Angelo, W. R., Sterbing-D'Angelo, S. J., de la Mothe, L. A. & Hackett, T. A. Neural latencies across auditory cortex of macaque support a dorsal stream supramodal timing advantage in primates. Proc. Natl Acad. Sci. USA 109, 18168–18173 (2012).

    Article  PubMed  Google Scholar 

  59. Grimsley, J. M., Shanbhag, S. J., Palmer, A. R. & Wallace, M. N. Processing of communication calls in guinea pig auditory cortex. PloS ONE 7, e51646 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Patterson, R. D., Uppenkamp, S., Johnsrude, I. S. & Griffiths, T. D. The processing of temporal pitch and melody information in auditory cortex. Neuron 36, 767–776 (2002). The authors present evidence for the hierarchical processing of pitch by performing fMRI on human listeners using sounds that are matched in spectral content but that either did or did not evoke a pitch percept.

    Article  CAS  PubMed  Google Scholar 

  61. Penagos, H., Melcher, J. R. & Oxenham, A. J. A neural representation of pitch salience in nonprimary human auditory cortex revealed with functional magnetic resonance imaging. J. Neurosci. 24, 6810–6815 (2004).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  62. Warren, J. D. & Griffiths, T. D. Distinct mechanisms for processing spatial sequences and pitch sequences in the human auditory brain. J. Neurosci. 23, 5799–5804 (2003).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Garcia, D., Hall, D. A. & Plack, C. J. The effect of stimulus context on pitch representations in the human auditory cortex. Neuroimage 51, 808–816 (2010).

    Article  PubMed  Google Scholar 

  64. Kumar, S., Stephan, K. E., Warren, J. D., Friston, K. J. & Griffiths, T. D. Hierarchical processing of auditory objects in humans. PLoS Computat. Biol. 3, e100 (2007). The authors present evidence for the hierarchical processing of spectral timbre in human listeners. The use of dynamic causal modelling techniques indicated that processing was both serial and hierarchical.

    Article  CAS  Google Scholar 

  65. Bizley, J. K., Walker, K. M., Nodal, F. R., King, A. J. & Schnupp, J. W. Auditory cortex represents both pitch judgments and the corresponding acoustic cues. Curr. Biol. 23, 620–625 (2013). The authors recorded neural responses in the auditory cortex of ferrets performing a pitch-direction discrimination task. Neural activity was modulated more by the ferrets' decision regarding the pitch of a target sound than by the actual pitch category.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Griffiths, T. D. et al. Direct recordings of pitch responses from human auditory cortex. Curr. Biol. 20, 1128–1132 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  67. Staeren, N., Renvall, H., De Martino, F., Goebel, R. & Formisano, E. Sound categories are represented as distributed patterns in the human auditory cortex. Curr. Biol. 19, 498–502 (2009).

    Article  CAS  PubMed  Google Scholar 

  68. Hall, D. A. & Plack, C. J. Pitch processing sites in the human auditory brain. Cereb. Cortex 19, 576–585 (2009).

    Article  PubMed  Google Scholar 

  69. Bizley, J. K., Walker, K. M., King, A. J. & Schnupp, J. W. Neural ensemble codes for stimulus periodicity in auditory cortex. J. Neurosci. 30, 5078–5091 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  70. Griffiths, T. D. & Hall, D. A. Mapping pitch representation in neural ensembles with fMRI. J. Neurosci. 32, 13343–13347 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  71. Nelken, I. et al. Responses of auditory cortex to complex stimuli: functional organization revealed using intrinsic optical signals. J. Neurophysiol. 99, 1928–1941 (2008).

    Article  PubMed  Google Scholar 

  72. Darwin, C. J. Auditory grouping. Trends Cogn. Sci. 1, 327–333 (1997).

    Article  CAS  PubMed  Google Scholar 

  73. Hackett, T. A. Information flow in the auditory cortical network. Hear. Res. 271, 133–146 (2011).

    Article  PubMed  Google Scholar 

  74. Dick, F. et al. In vivo functional and myeloarchitectonic mapping of human primary auditory areas. J. Neurosci. 32, 16095–16105 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Schebesch, G., Lingner, A., Firzlaff, U., Wiegrebe, L. & Grothe, B. Perception and neural representation of size-variant human vowels in the Mongolian gerbil (Meriones unguiculatus). Hear. Res. 261, 1–8 (2010).

    Article  PubMed  Google Scholar 

  76. Versnel, H. & Shamma, S. A. Spectral-ripple representation of steady-state vowels in primary auditory cortex. J. Acoust. Soc. Am. 103, 2502–2514 (1998).

    Article  CAS  PubMed  Google Scholar 

  77. Formisano, E., De Martino, F., Bonte, M. & Goebel, R. “Who” is saying “what”? Brain-based decoding of human voice and speech. Science 322, 970–973 (2008).

    Article  CAS  PubMed  Google Scholar 

  78. Bizley, J. K. & Walker, K. M. Distributed sensitivity to conspecific vocalizations and implications for the auditory dual stream hypothesis. J. Neurosci. 29, 3011–3013 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  79. Walker, K. M., Bizley, J. K., King, A. J. & Schnupp, J. W. Multiplexed and robust representations of sound features in auditory cortex. J. Neurosci. 31, 14565–14576 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  80. Bidelman, G. M., Moreno, S. & Alain, C. Tracing the emergence of categorical speech perception in the human auditory system. Neuroimage 79, 201–212 (2013).

    Article  PubMed  Google Scholar 

  81. Parker, A. J. & Newsome, W. T. Sense and the single neuron: probing the physiology of perception. Annu. Rev. Neurosci. 21, 227–277 (1998).

    Article  CAS  PubMed  Google Scholar 

  82. Nienborg, H., Cohen, M. R. & Cumming, B. G. Decision-related activity in sensory neurons: correlations among neurons and with behavior. Annu. Rev. Neurosci. 35, 463–483 (2012).

    Article  CAS  PubMed  Google Scholar 

  83. Gold, J. I. & Shadlen, M. N. The neural basis of decision making. Annu. Rev. Neurosci. 30, 535–574 (2007).

    Article  CAS  PubMed  Google Scholar 

  84. Schall, J. D. & Bichot, N. P. Neural correlates of visual and motor decision processes. Curr. Opin. Neurobiol. 8, 211–217 (1998).

    Article  CAS  PubMed  Google Scholar 

  85. Niwa, M., Johnson, J. S., O'Connor, K. N. & Sutter, M. L. Activity related to perceptual judgment and action in primary auditory cortex. J. Neurosci. 32, 3193–3210 (2012). The authors recorded single- and multiunit activity in the auditory cortex of animals performing an auditory modulation detection task. In addition to acoustic information, neural activity was informative about both motor actions and the animals' behavioural choice.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Kilian-Hutten, N., Valente, G., Vroomen, J. & Formisano, E. Auditory cortex encodes the perceptual interpretation of ambiguous sound. J. Neurosci. 31, 1715–1720 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  87. Russ, B. E., Orr, L. E. & Cohen, Y. E. Prefrontal neurons predict choices during an auditory same-different task. Curr. Biol. 18, 1483–1488 (2008). The authors recorded from neurons in the ventrolateral prefrontal cortex of monkeys performing a non-spatial same–different task. Neural activity predicted animals' behavioural choices, demonstrating a direct link between single neurons and behavioural choice.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  88. Tsunada, J., Lee, J. H. & Cohen, Y. E. Representation of speech categories in the primate auditory cortex. J. Neurophysiol. 105, 2634–2646 (2011).

    Article  PubMed  PubMed Central  Google Scholar 

  89. Russ, B. E., Ackelson, A. L., Baker, A. E. & Cohen, Y. E. Coding of auditory-stimulus identity in the auditory non-spatial processing stream. J. Neurophysiol. 99, 87–95 (2008).

    Article  PubMed  Google Scholar 

  90. Lemus, L., Hernandez, A. & Romo, R. Neural encoding of auditory discrimination in ventral premotor cortex. Proc. Natl Acad. Sci. USA 106, 14640–14645 (2009).

    Article  PubMed  Google Scholar 

  91. Lemus, L., Hernandez, A. & Romo, R. Neural codes for perceptual discrimination of acoustic flutter in the primate auditory cortex. Proc. Natl Acad. Sci. USA 106, 9471–9476 (2009).

    Article  PubMed  Google Scholar 

  92. Selezneva, E., Scheich, H. & Brosch, M. Dual time scales for categorical decision making in auditory cortex. Curr. Biol. 16, 2428–2433 (2006).

    Article  CAS  PubMed  Google Scholar 

  93. Gold, J. I. & Shadlen, M. N. Neural computations that underlie decisions about sensory stimuli. Trends Cogn. Sci. 5, 10–16 (2001).

    Article  PubMed  Google Scholar 

  94. Buffalo, E. A., Fries, P., Landman, R., Buschman, T. J. & Desimone, R. Laminar differences in gamma and alpha coherence in the ventral stream. Proc. Natl Acad. Sci. USA 108, 11262–11267 (2011).

    Article  PubMed  Google Scholar 

  95. Niwa, M., Johnson, J. S., O'Connor, K. N. & Sutter, M. L. Differences between primary auditory cortex and auditory belt related to encoding and choice for AM sounds. J. Neurosci. 33, 8378–8395 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  96. Romo, R. & Salinas, E. Sensing and deciding in the somatosensory system. Curr. Opin. Neurobiol. 9, 487–493 (1999).

    Article  CAS  PubMed  Google Scholar 

  97. Riecke, L. et al. Hearing an illusory vowel in noise: suppression of auditory cortical activity. J. Neurosci. 32, 8024–8034 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Riecke, L., Mendelsohn, D., Schreiner, C. & Formisano, E. The continuity illusion adapts to the auditory scene. Hear. Res. 247, 71–77 (2009).

    Article  PubMed  Google Scholar 

  99. Riecke, L., Micheyl, C. & Oxenham, A. J. Global not local masker features govern the auditory continuity illusion. J. Neurosci. 32, 4660–4664 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  100. Pressnitzer, D., Suied, C. & Shamma, S. A. Auditory scene analysis: the sweet music of ambiguity. Front. Hum. Neurosci. 5, 158 (2011).

    Article  PubMed  PubMed Central  Google Scholar 

  101. Leopold, D. A. & Logothetis, N. K. Multistable phenomena: changing views in perception. Trends Cogn. Sci. 3, 254–264 (1999).

    Article  CAS  PubMed  Google Scholar 

  102. Shamma, S. A. & Micheyl, C. Behind the scenes of auditory perception. Curr. Opin. Neurobiol. 20, 361–366 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  103. Pressnitzer, D., Sayles, M., Micheyl, C. & Winter, I. M. Perceptual organization of sound begins in the auditory periphery. Curr. Biol. 18, 1124–1128 (2008).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  104. Micheyl, C. et al. The role of auditory cortex in the formation of auditory streams. Hear. Res. 229, 116–131 (2007).

    Article  PubMed  PubMed Central  Google Scholar 

  105. Gutschalk, A., Micheyl, C. & Oxenham, A. J. Neural correlates of auditory perceptual awareness under informational masking. PLoS Biol. 6, e138 (2008).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  106. Kondo, H. M. & Kashino, M. Involvement of the thalamocortical loop in the spontaneous switching of percepts in auditory streaming. J. Neurosci. 29, 12695–12701 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  107. Deike, S., Gaschler-Markefski, B., Brechmann, A. & Scheich, H. Auditory stream segregation relying on timbre involves left auditory cortex. Neuroreport 15, 1511–1514 (2004).

    Article  PubMed  Google Scholar 

  108. Hill, K. T., Bishop, C. W., Yadav, D. & Miller, L. M. Pattern of BOLD signal in auditory cortex relates acoustic response to perceptual streaming. BMC Neurosci. 12, 85 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Micheyl, C., Tian, B., Carlyon, R. P. & Rauschecker, J. P. Perceptual organization of tone sequences in the auditory cortex of awake macaques. Neuron 48, 139–148 (2005).

    Article  CAS  PubMed  Google Scholar 

  110. Fishman, Y. I., Reser, D. H., Arezzo, J. C. & Steinschneider, M. Neural correlates of auditory stream segregation in primary auditory cortex of the awake monkey. Hear. Res. 151, 167–187 (2001). The authors present single-unit recordings in the auditory cortex in response to ABA tone sequences. Non-best frequency tones were suppressed at presentation rates and frequency separations in a manner that mirrored human perception.

    Article  CAS  PubMed  Google Scholar 

  111. Elhilali, M., Ma, L., Micheyl, C., Oxenham, A. J. & Shamma, S. A. Temporal coherence in the perceptual organization and cortical representation of auditory scenes. Neuron 61, 317–329 (2009). Using psychophysical methods, the authors demonstrate that spectral components that are well separated in frequency are no longer heard as separate streams if presented synchronously rather than consecutively. The authors present a 'temporal coherence' theory of auditory streaming.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  112. Micheyl, C., Kreft, H., Shamma, S. & Oxenham, A. J. Temporal coherence versus harmonicity in auditory stream formation. J. Acoust. Soc. Am. 133, EL188–EL194 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  113. Kashino, M. & Kondo, H. M. Functional brain networks underlying perceptual switching: auditory streaming and verbal transformations. Phil. Trans. R. Soc. B 367, 977–987 (2012).

    Article  PubMed  Google Scholar 

  114. Tsunada, J., Lee, J. H. & Cohen, Y. E. Differential representation of auditory categories between cell classes in primate auditory cortex. J. Physiol. 590, 3129–3139 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  115. Obleser, J., Leaver, A. M., Vanmeter, J. & Rauschecker, J. P. Segregation of vowels and consonants in human auditory cortex: evidence for distributed hierarchical organization. Front. Psychol. 1, 232 (2010).

    Article  PubMed  PubMed Central  Google Scholar 

  116. Chevillet, M. A., Jiang, X., Rauschecker, J. P. & Riesenhuber, M. Automatic phoneme category selectivity in the dorsal auditory stream. J. Neurosci. 33, 5208–5215 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Leaver, A. M. & Rauschecker, J. P. Cortical representation of natural complex sounds: effects of acoustic features and auditory object category. J. Neurosci. 30, 7604–7612 (2010). The authors used fMRI to investigate the hierarchical processing of natural sounds in the ventral pathway. Category-selective responses were identified in anterior superior temporal regions, whereas responses in the superior temporal sulcus were not category-selective but rather responded to acoustic features.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  118. Giordano, B. L., McAdams, S., Zatorre, R. J., Kriegeskorte, N. & Belin, P. Abstract encoding of auditory objects in cortical activity patterns. Cereb. Cortex 23, 2025–2037 (2013). The authors combined multivariate analyses of fMRI data with analysis of the low-level acoustical information to examine the abstract encoding of non-speech categories. They observed category sensitivity in the planum temporale, suggesting that object processing is not restricted to the ventral pathway.

    Article  PubMed  Google Scholar 

  119. Gifford, G. W., MacLean, K. A., Hauser, M. D. & Cohen, Y. E. The neurophysiology of functionally meaningful categories: macaque ventrolateral prefrontal cortex plays a critical role in spontaneous categorization of species-specific vocalizations. J. Cogn. Neurosci. 17, 1471–1482 (2005).

    Article  PubMed  Google Scholar 

  120. Ohl, F. W., Scheich, H. & Freeman, W. J. Change in pattern of ongoing cortical activity with auditory category learning. Nature 412, 733–736 (2001). The authors recorded from the auditory cortex of gerbils while the animals learned an acoustic classification task. They demonstrate that the stimulus representation in the auditory cortex undergoes a dramatic change in its dynamic pattern at the point when animals begin to correctly classify the acoustic stimuli.

  121. Fritz, J. B., David, S. V., Radtke-Schuller, S., Yin, P. & Shamma, S. A. Adaptive, behaviorally gated, persistent encoding of task-relevant auditory information in ferret frontal cortex. Nature Neurosci. 13, 1011–1019 (2010).

  122. King, A. J. & Nelken, I. Unraveling the principles of auditory cortical processing: can we learn from the visual system? Nature Neurosci. 12, 698–701 (2009).

  123. Hegdé, J. & Van Essen, D. C. Role of primate visual area V4 in the processing of 3D shape characteristics defined by disparity. J. Neurophysiol. 94, 2856–2866 (2005).

  124. Alain, C. Breaking the wave: effects of attention and learning on concurrent sound perception. Hear. Res. 229, 225–236 (2007).

  125. Näätänen, R. & Picton, T. The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure. Psychophysiology 24, 375–425 (1987).

  126. Kujala, T., Tervaniemi, M. & Schröger, E. The mismatch negativity in cognitive and clinical neuroscience: theoretical and methodological considerations. Biol. Psychol. 74, 1–19 (2007).

  127. Picton, T. W., Alain, C., Otten, L., Ritter, W. & Achim, A. Mismatch negativity: different water in the same river. Audiol. Neurootol. 5, 111–139 (2000).

  128. Alain, C., Woods, D. L. & Ogawa, K. H. Brain indices of automatic pattern processing. Neuroreport 6, 140–144 (1994).

  129. Sussman, E. S., Horváth, J., Winkler, I. & Orr, M. The role of attention in the formation of auditory streams. Percept. Psychophys. 69, 136–152 (2007).

  130. Cusack, R., Deeks, J., Aikman, G. & Carlyon, R. P. Effects of location, frequency region, and time course of selective attention on auditory scene analysis. J. Exp. Psychol. Hum. Percept. Perform. 30, 643–656 (2004).

  131. Winkler, I., Takegata, R. & Sussman, E. Event-related brain potentials reveal multiple stages in the perceptual organization of sound. Brain Res. Cogn. Brain Res. 25, 291–299 (2005).

  132. Snyder, J. S., Alain, C. & Picton, T. W. Effects of attention on neuroelectric correlates of auditory stream segregation. J. Cogn. Neurosci. 18, 1–13 (2006).

  133. Snyder, J. S., Carter, O. L., Hannon, E. E. & Alain, C. Adaptation reveals multiple levels of representation in auditory stream segregation. J. Exp. Psychol. Hum. Percept. Perform. 35, 1232–1244 (2009).

  134. Knudsen, E. I. Fundamental components of attention. Annu. Rev. Neurosci. 30, 57–78 (2007).

  135. Shinn-Cunningham, B. G. & Best, V. Selective attention in normal and impaired hearing. Trends Amplif. 12, 283–299 (2008).

  136. Desimone, R. & Duncan, J. Neural mechanisms of selective visual attention. Annu. Rev. Neurosci. 18, 193–222 (1995).

  137. Zatorre, R. J., Mondor, T. A. & Evans, A. C. Auditory attention to space and frequency activates similar cerebral systems. Neuroimage 10, 544–554 (1999).

  138. Duncan, J. EPS Mid-Career Award 2004: brain mechanisms of attention. Q. J. Exp. Psychol. 59, 2–27 (2006).

  139. Lee, A. K. & Shinn-Cunningham, B. G. Effects of reverberant spatial cues on attention-dependent object formation. J. Assoc. Res. Otolaryngol. 9, 150–160 (2008).

  140. Darwin, C. J. & Hukin, R. W. Perceptual segregation of a harmonic from a vowel by interaural time difference in conjunction with mistuning and onset asynchrony. J. Acoust. Soc. Am. 103, 1080–1084 (1998).

  141. Best, V., Gallun, F. J., Carlile, S. & Shinn-Cunningham, B. G. Binaural interference and auditory grouping. J. Acoust. Soc. Am. 121, 1070–1076 (2007).

  142. Shinn-Cunningham, B. G., Lee, A. K. & Oxenham, A. J. A sound element gets lost in perceptual competition. Proc. Natl Acad. Sci. USA 104, 12223–12227 (2007).

  143. Kastner, S. & Ungerleider, L. G. Mechanisms of visual attention in the human cortex. Annu. Rev. Neurosci. 23, 315–341 (2000).

  144. Shamma, S. On the emergence and awareness of auditory objects. PLoS Biol. 6, e155 (2008).

  145. Dick, F., Lee, H. L., Nusbaum, H. & Price, C. J. Auditory-motor expertise alters “speech selectivity” in professional musicians and actors. Cereb. Cortex 21, 938–948 (2011).

  146. Fritz, J., Shamma, S., Elhilali, M. & Klein, D. Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nature Neurosci. 6, 1216–1223 (2003).

  147. Atiani, S., Elhilali, M., David, S. V., Fritz, J. B. & Shamma, S. A. Task difficulty and performance induce diverse adaptive patterns in gain and shape of primary auditory cortical receptive fields. Neuron 61, 467–480 (2009).

  148. Niwa, M., Johnson, J. S., O'Connor, K. N. & Sutter, M. L. Active engagement improves primary auditory cortical neurons' ability to discriminate temporal modulation. J. Neurosci. 32, 9323–9334 (2012).

  149. Lee, C. C. & Middlebrooks, J. C. Auditory cortex spatial sensitivity sharpens during task performance. Nature Neurosci. 14, 108–114 (2011).

  150. Alain, C. & Woods, D. L. Attention modulates auditory pattern memory as indexed by event-related brain potentials. Psychophysiology 34, 534–546 (1997).

  151. Woods, D. L., Alho, K. & Algazi, A. Intermodal selective attention: evidence for processing in tonotopic auditory fields. Psychophysiology 30, 287–295 (1993).

  152. Woods, D. L., Alho, K. & Algazi, A. Intermodal selective attention. I. Effects on event-related potentials to lateralized auditory and visual stimuli. Electroencephalogr. Clin. Neurophysiol. 82, 341–355 (1992).

  153. Petkov, C. I. et al. Attentional modulation of human auditory cortex. Nature Neurosci. 7, 658–663 (2004).

  154. Rinne, T. et al. Attention modulates sound processing in human auditory cortex but not the inferior colliculus. Neuroreport 18, 1311–1314 (2007).

  155. Woldorff, M. G. et al. Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proc. Natl Acad. Sci. USA 90, 8722–8726 (1993).

  156. Ding, N. & Simon, J. Z. Neural coding of continuous speech in auditory cortex during monaural and dichotic listening. J. Neurophysiol. 107, 78–89 (2012).

  157. Mesgarani, N. & Chang, E. F. Selective cortical representation of attended speaker in multi-talker speech perception. Nature 485, 233–236 (2012). The authors used electrocorticographic recording in human patients to investigate neural activity in listeners selectively attending to one stream of speech while ignoring a distractor stream. Neural activity represented crucial features of the attended speech while apparently suppressing the unattended stream.

  158. Degerman, A., Rinne, T., Salmi, J., Salonen, O. & Alho, K. Selective attention to sound location or pitch studied with fMRI. Brain Res. 1077, 123–134 (2006).

  159. Salmi, J., Rinne, T., Degerman, A. & Alho, K. Orienting and maintenance of spatial attention in audition and vision: an event-related brain potential study. Eur. J. Neurosci. 25, 3725–3733 (2007).

  160. Ahveninen, J. et al. Task-modulated “what” and “where” pathways in human auditory cortex. Proc. Natl Acad. Sci. USA 103, 14608–14613 (2006).

  161. Buffalo, E. A., Fries, P., Landman, R., Liang, H. & Desimone, R. A backward progression of attentional effects in the ventral stream. Proc. Natl Acad. Sci. USA 107, 361–365 (2010).

  162. Sugihara, T., Diltz, M. D., Averbeck, B. B. & Romanski, L. M. Integration of auditory and visual communication information in the primate ventrolateral prefrontal cortex. J. Neurosci. 26, 11138–11147 (2006).

  163. Romanski, L. M., Averbeck, B. B. & Diltz, M. Neural representation of vocalizations in the primate ventrolateral prefrontal cortex. J. Neurophysiol. 93, 734–747 (2005).

  164. Gifford, G. W., Hauser, M. D. & Cohen, Y. E. Discrimination of functionally referential calls by laboratory-housed rhesus macaques: implications for neuroethological studies. Brain Behav. Evol. 61, 213–224 (2003).

  165. Teki, S. et al. Navigating the auditory scene: an expert role for the hippocampus. J. Neurosci. 32, 12251–12257 (2012).

  166. Culling, J. F. & Summerfield, Q. Perceptual separation of concurrent speech sounds: absence of across-frequency grouping by common interaural delay. J. Acoust. Soc. Am. 98, 785–797 (1995).

  167. Darwin, C. J. & Hukin, R. W. Perceptual segregation of a harmonic from a vowel by interaural time difference and frequency proximity. J. Acoust. Soc. Am. 102, 2316–2324 (1997).

  168. McAdams, S. & Bregman, A. S. Hearing musical streams. Computer Music J. 3, 26–43 (1979).

  169. Shamma, S. A., Elhilali, M. & Micheyl, C. Temporal coherence and attention in auditory scene analysis. Trends Neurosci. 34, 114–123 (2011).

Acknowledgements

We thank H. Hersh for a critical reading of the manuscript. J.K.B. is supported by a Royal Society Dorothy Hodgkin Research Fellowship and BBSRC grant BB/H016813/1. Y.E.C. is supported by grants from the US National Institute on Deafness and Other Communication Disorders and US National Institutes of Health.

Author information

Corresponding authors

Correspondence to Jennifer K. Bizley or Yale E. Cohen.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Pitch

The attribute of a sound that enables it to be ordered from high to low on a musical scale. The perceived pitch for a periodic sound is determined by its fundamental frequency (F0), usually the lowest frequency component.

Timbre

The quality of a sound that is determined by its spectral or temporal envelope. Timbre allows a listener to differentiate between a violin and a banjo despite the fact that the two instruments may be producing a sound that has the same pitch.

Harmonicity

A harmonic sound contains frequency components at integer multiples of the fundamental frequency (see the definition for 'pitch'). Many vocalizations and other pitch-evoking sounds have a harmonic structure.
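
As an illustration of this definition, the following minimal Python sketch lists the components of a harmonic sound and checks whether a set of frequencies is harmonic with respect to a given F0. The function names, the 200 Hz example and the tolerance value are hypothetical choices made for illustration; they are not taken from the studies cited in this article.

```python
def harmonic_frequencies(f0, n_harmonics):
    """Return the first n_harmonics component frequencies (Hz) of a harmonic sound.

    Each component sits at an integer multiple of the fundamental frequency f0.
    """
    return [f0 * k for k in range(1, n_harmonics + 1)]


def is_harmonic(components, f0, tolerance=0.01):
    """Return True if every component is close to an integer multiple of f0.

    `tolerance` is an assumed, illustrative threshold on the deviation of
    (frequency / f0) from the nearest integer; it is not a perceptual constant.
    """
    for freq in components:
        ratio = freq / f0
        nearest = round(ratio)
        if nearest < 1 or abs(ratio - nearest) > tolerance:
            return False
    return True


# A 200 Hz fundamental with four harmonics.
tone = harmonic_frequencies(200.0, 4)

# Shifting the second harmonic away from 400 Hz makes the complex inharmonic.
mistuned = [200.0, 430.0, 600.0, 800.0]
```

Here `is_harmonic(tone, 200.0)` holds, whereas the mistuned complex fails the check because its second component no longer lies at an integer multiple of F0.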

Spectral envelope

This term refers to the distribution of power across frequency in a sound. For a harmonic sound, this equates to the relative power across harmonics.

Dynamic causal modelling

A computational approach that performs Bayesian model comparisons in order to infer the organizational structure of processing within different brain regions.

Auditory flutter

The sensation produced by a periodic stimulus in which a listener can hear the sound as being intermittent. At higher frequencies, the sound is fused into one with a continuous melodic pitch. The border between being heard as intermittent or continuous is the flicker–fusion limit.

Forward masking

A process by which a sound is obscured by a masker (for example, a noise burst) that precedes the sound.

Categorical perception

The experience of perceiving a stimulus as being the same (that is, invariant) despite the fact that the physical properties of the stimulus have changed smoothly along a specific axis or continuum. A characteristic of categorical perception is that for a continuously changing stimulus dimension, subjects generalize across changes, with a sharp change in the perception from one class to another at the position of the boundary of the stimulus identity.
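
The sharp change at the category boundary can be sketched with a steep identification function. The example below assumes a logistic function over a stimulus continuum scaled from 0 to 1; the logistic form, the boundary position (0.5) and the slope (20) are illustrative assumptions, not values reported in this article.

```python
import math


def identification_probability(x, boundary=0.5, slope=20.0):
    """Probability of labelling stimulus x as category 'B'.

    x is a position on a continuum scaled to [0, 1]. The logistic shape,
    boundary and slope are assumed purely for illustration.
    """
    return 1.0 / (1.0 + math.exp(-slope * (x - boundary)))


def label(x, boundary=0.5, slope=20.0):
    """Assign the more probable category label to stimulus x."""
    return "B" if identification_probability(x, boundary, slope) >= 0.5 else "A"


# Equal physical steps along the continuum (0.2 units each):
within_category_change = identification_probability(0.3) - identification_probability(0.1)
across_boundary_change = identification_probability(0.6) - identification_probability(0.4)
```

Under these assumptions, stimuli at 0.1 and 0.3 receive the same label despite their physical difference, whereas an equally sized step from 0.4 to 0.6 crosses the boundary and produces a much larger change in the identification probability.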

Scene analysis

The process by which the brain organizes and segregates acoustic stimuli into meaningful elements or objects.

Grandmother cells

Hypothetical cells that represent a very specific complex object or concept — such as one's grandmother.

Object-related negativity

An evoked-potential component that is elicited when two concurrently presented sounds are perceived as originating from different sources based on simultaneous grouping cues.

About this article

Cite this article

Bizley, J., Cohen, Y. The what, where and how of auditory-object perception. Nat Rev Neurosci 14, 693–707 (2013). https://doi.org/10.1038/nrn3565

  • DOI: https://doi.org/10.1038/nrn3565
