To derive meaning from sound, the brain must integrate information across many timescales. What computations underlie multiscale integration in human auditory cortex? Evidence suggests that auditory cortex analyses sound using both generic acoustic representations (for example, spectrotemporal modulation tuning) and category-specific computations, but the timescales over which these putatively distinct computations integrate remain unclear. To answer this question, we developed a general method to estimate sensory integration windows—the time window within which stimuli alter the neural response—and applied our method to intracranial recordings from neurosurgical patients. We show that human auditory cortex integrates hierarchically across diverse timescales spanning from ~50 to 400 ms. Moreover, we find that neural populations with short and long integration windows exhibit distinct functional properties: short-integration electrodes (less than ~200 ms) show prominent spectrotemporal modulation selectivity, while long-integration electrodes (greater than ~200 ms) show prominent category selectivity. These findings reveal how multiscale integration organizes auditory computation in the human brain.
The data supporting the findings of this study are available from the corresponding author upon request; access is restricted owing to the sensitive nature of human patient data. The TCI stimuli and the source data underlying key statistics and figures (Figs. 4 and 5) are available at this repository: https://github.com/snormanhaignere/NHB-TCI-source-data. Source data are also provided with this paper.
Code implementing the TCI analyses described in this paper is available at: https://github.com/snormanhaignere/TCI
Brodbeck, C., Hong, L. E. & Simon, J. Z. Rapid transformation from auditory to linguistic representations of continuous speech. Curr. Biol. 28, 3976–3983 (2018).
DeWitt, I. & Rauschecker, J. P. Phoneme and word recognition in the auditory ventral stream. Proc. Natl Acad. Sci. USA 109, E505–E514 (2012).
Hickok, G. & Poeppel, D. The cortical organization of speech processing. Nat. Rev. Neurosci. 8, 393–402 (2007).
Santoro, R. et al. Encoding of natural sounds at multiple spectral and temporal resolutions in the human auditory cortex. PLoS Comput. Biol. 10, e1003412 (2014).
Hullett, P. W., Hamilton, L. S., Mesgarani, N., Schreiner, C. E. & Chang, E. F. Human superior temporal gyrus organization of spectrotemporal modulation tuning derived from speech stimuli. J. Neurosci. 36, 2014–2026 (2016).
Schönwiesner, M. & Zatorre, R. J. Spectro-temporal modulation transfer function of single voxels in the human auditory cortex measured with high-resolution fMRI. Proc. Natl Acad. Sci. USA 106, 14611–14616 (2009).
Barton, B., Venezia, J. H., Saberi, K., Hickok, G. & Brewer, A. A. Orthogonal acoustic dimensions define auditory field maps in human cortex. Proc. Natl Acad. Sci. USA 109, 20738–20743 (2012).
Leaver, A. M. & Rauschecker, J. P. Cortical representation of natural complex sounds: effects of acoustic features and auditory object category. J. Neurosci. 30, 7604–7612 (2010).
Norman-Haignere, S. V., Kanwisher, N. G. & McDermott, J. H. Distinct cortical pathways for music and speech revealed by hypothesis-free voxel decomposition. Neuron 88, 1281–1296 (2015).
Kell, A. J., Yamins, D. L., Shook, E. N., Norman-Haignere, S. V. & McDermott, J. H. A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron 98, 630–644 (2018).
Overath, T., McDermott, J. H., Zarate, J. M. & Poeppel, D. The cortical analysis of speech-specific temporal structure revealed by responses to sound quilts. Nat. Neurosci. 18, 903–911 (2015).
Davis, M. H. & Johnsrude, I. S. Hierarchical processing in spoken language comprehension. J. Neurosci. 23, 3423–3431 (2003).
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P. & Pike, B. Voice-selective areas in human auditory cortex. Nature 403, 309–312 (2000).
Zuk, N. J., Teoh, E. S. & Lalor, E. C. EEG-based classification of natural sounds reveals specialized responses to speech and music. NeuroImage 210, 116558 (2020).
Di Liberto, G. M., O’Sullivan, J. A. & Lalor, E. C. Low-frequency cortical entrainment to speech reflects phoneme-level processing. Curr. Biol. 25, 2457–2465 (2015).
Ding, N. et al. Temporal modulations in speech and music. Neurosci. Biobehav. Rev. 81, 181–187 (2017).
Elhilali, M. in Timbre: Acoustics, Perception, and Cognition (eds Siedenburg, K. et al.) 335–359 (Springer, 2019).
Patel, A. D. Music, Language, and the Brain (Oxford Univ. Press, 2007).
Norman-Haignere, S. V. & McDermott, J. H. Neural responses to natural and model-matched stimuli reveal distinct computations in primary and nonprimary auditory cortex. PLoS Biol. 16, e2005127 (2018).
Theunissen, F. & Miller, J. P. Temporal encoding in nervous systems: a rigorous definition. J. Comput. Neurosci. 2, 149–162 (1995).
Lerner, Y., Honey, C. J., Silbert, L. J. & Hasson, U. Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J. Neurosci. 31, 2906–2915 (2011).
Chen, C., Read, H. L. & Escabí, M. A. Precise feature based time scales and frequency decorrelation lead to a sparse auditory code. J. Neurosci. 32, 8454–8468 (2012).
Meyer, A. F., Williamson, R. S., Linden, J. F. & Sahani, M. Models of neuronal stimulus-response functions: elaboration, estimation, and evaluation. Front. Syst. Neurosci. 10, 109 (2017).
Khatami, F. & Escabí, M. A. Spiking network optimized for word recognition in noise predicts auditory system hierarchy. PLoS Comput. Biol. 16, e1007558 (2020).
Harper, N. S. et al. Network receptive field modeling reveals extensive integration and multi-feature selectivity in auditory cortical neurons. PLoS Comput. Biol. 12, e1005113 (2016).
Keshishian, M. et al. Estimating and interpreting nonlinear receptive field of sensory neural responses with deep neural network models. eLife 9, e53445 (2020).
Albouy, P., Benjamin, L., Morillon, B. & Zatorre, R. J. Distinct sensitivity to spectrotemporal modulation supports brain asymmetry for speech and melody. Science 367, 1043–1047 (2020).
Flinker, A., Doyle, W. K., Mehta, A. D., Devinsky, O. & Poeppel, D. Spectrotemporal modulation provides a unifying framework for auditory cortical asymmetries. Nat. Hum. Behav. 3, 393–405 (2019).
Teng, X. & Poeppel, D. Theta and Gamma bands encode acoustic dynamics over wide-ranging timescales. Cereb. Cortex 30, 2600–2614 (2020).
Obleser, J., Eisner, F. & Kotz, S. A. Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features. J. Neurosci. 28, 8116–8123 (2008).
Baumann, S. et al. The topography of frequency and time representation in primate auditory cortices. eLife 4, e03256 (2015).
Rogalsky, C., Rong, F., Saberi, K. & Hickok, G. Functional anatomy of language and music perception: temporal and structural factors investigated using functional magnetic resonance imaging. J. Neurosci. 31, 3843–3852 (2011).
Farbood, M. M., Heeger, D. J., Marcus, G., Hasson, U. & Lerner, Y. The neural processing of hierarchical structure in music and speech at different timescales. Front. Neurosci. 9, 157 (2015).
Angeloni, C. & Geffen, M. N. Contextual modulation of sound processing in the auditory cortex. Curr. Opin. Neurobiol. 49, 8–15 (2018).
Griffiths, T. D. et al. Direct recordings of pitch responses from human auditory cortex. Curr. Biol. 20, 1128–1132 (2010).
Mesgarani, N., Cheung, C., Johnson, K. & Chang, E. F. Phonetic feature encoding in human superior temporal gyrus. Science 343, 1006–1010 (2014).
Ray, S. & Maunsell, J. H. R. Different origins of gamma rhythm and high-gamma activity in macaque visual cortex. PLoS Biol. 9, e1000610 (2011).
Manning, J. R., Jacobs, J., Fried, I. & Kahana, M. J. Broadband shifts in local field potential power spectra are correlated with single-neuron spiking in humans. J. Neurosci. 29, 13613–13620 (2009).
Slaney, M. Auditory Toolbox Version 2. Tech. Rep. 1998-010 (Interval Research Corporation, 1998).
Chi, T., Ru, P. & Shamma, S. A. Multiresolution spectrotemporal analysis of complex sounds. J. Acoust. Soc. Am. 118, 887–906 (2005).
Singh, N. C. & Theunissen, F. E. Modulation spectra of natural sounds and ethological theories of auditory processing. J. Acoust. Soc. Am. 114, 3394–3411 (2003).
Di Liberto, G. M., Wong, D., Melnik, G. A. & de Cheveigné, A. Low-frequency cortical responses to natural speech reflect probabilistic phonotactics. Neuroimage 196, 237–247 (2019).
Leonard, M. K., Bouchard, K. E., Tang, C. & Chang, E. F. Dynamic encoding of speech sequence probability in human temporal cortex. J. Neurosci. 35, 7203–7214 (2015).
Schoppe, O., Harper, N. S., Willmore, B. D., King, A. J. & Schnupp, J. W. Measuring the performance of neural models. Front. Comput. Neurosci. 10, 10 (2016).
Mizrahi, A., Shalev, A. & Nelken, I. Single neuron and population coding of natural sounds in auditory cortex. Curr. Opin. Neurobiol. 24, 103–110 (2014).
Chien, H.-Y. S. & Honey, C. J. Constructing and forgetting temporal context in the human cerebral cortex. Neuron 106, 675–686 (2020).
Panzeri, S., Brunel, N., Logothetis, N. K. & Kayser, C. Sensory neural codes using multiplexed temporal scales. Trends Neurosci. 33, 111–120 (2010).
Joris, P. X., Schreiner, C. E. & Rees, A. Neural processing of amplitude-modulated sounds. Physiol. Rev. 84, 541–577 (2004).
Wang, X., Lu, T., Bendor, D. & Bartlett, E. Neural coding of temporal information in auditory thalamus and cortex. Neuroscience 154, 294–303 (2008).
Gao, X. & Wehr, M. A coding transformation for temporally structured sounds within auditory cortical neurons. Neuron 86, 292–303 (2015).
McDermott, J. H. & Simoncelli, E. P. Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis. Neuron 71, 926–940 (2011).
Cohen, M. R. & Kohn, A. Measuring and interpreting neuronal correlations. Nat. Neurosci. 14, 811–819 (2011).
Murray, J. D. et al. A hierarchy of intrinsic timescales across primate cortex. Nat. Neurosci. 17, 1661–1663 (2014).
Chaudhuri, R., Knoblauch, K., Gariel, M.-A., Kennedy, H. & Wang, X.-J. A large-scale circuit mechanism for hierarchical dynamical processing in the primate cortex. Neuron 88, 419–431 (2015).
Rauschecker, J. P. & Scott, S. K. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat. Neurosci. 12, 718–724 (2009).
Sharpee, T. O., Atencio, C. A. & Schreiner, C. E. Hierarchical representations in the auditory cortex. Curr. Opin. Neurobiol. 21, 761–767 (2011).
Zatorre, R. J., Belin, P. & Penhune, V. B. Structure and function of auditory cortex: music and speech. Trends Cogn. Sci. 6, 37–46 (2002).
Poeppel, D. The analysis of speech in different temporal integration windows: cerebral lateralization as ‘asymmetric sampling in time’. Speech Commun. 41, 245–255 (2003).
Hamilton, L. S., Oganian, Y., Hall, J. & Chang, E. F. Parallel and distributed encoding of speech across human auditory cortex. Cell 184, 4626–4639 (2021).
Nourski, K. V. et al. Functional organization of human auditory cortex: investigation of response latencies through direct recordings. NeuroImage 101, 598–609 (2014).
Bartlett, E. L. The organization and physiology of the auditory thalamus and its role in processing acoustic features important for speech perception. Brain Lang. 126, 29–48 (2013).
Gattass, R., Gross, C. G. & Sandell, J. H. Visual topography of V2 in the macaque. J. Comp. Neurol. 201, 519–539 (1981).
Dumoulin, S. O. & Wandell, B. A. Population receptive field estimates in human visual cortex. Neuroimage 39, 647–660 (2008).
Ding, N., Melloni, L., Zhang, H., Tian, X. & Poeppel, D. Cortical tracking of hierarchical linguistic structures in connected speech. Nat. Neurosci. 19, 158–164 (2016).
Suied, C., Agus, T. R., Thorpe, S. J., Mesgarani, N. & Pressnitzer, D. Auditory gist: recognition of very short sounds from timbre cues. J. Acoust. Soc. Am. 135, 1380–1391 (2014).
Donhauser, P. W. & Baillet, S. Two distinct neural timescales for predictive speech processing. Neuron 105, 385–393 (2020).
Ulanovsky, N., Las, L., Farkas, D. & Nelken, I. Multiple time scales of adaptation in auditory cortex neurons. J. Neurosci. 24, 10440–10453 (2004).
Lu, K. et al. Implicit memory for complex sounds in higher auditory cortex of the ferret. J. Neurosci. 38, 9955–9966 (2018).
Chew, S. J., Mello, C., Nottebohm, F., Jarvis, E. & Vicario, D. S. Decrements in auditory responses to a repeated conspecific song are long-lasting and require two periods of protein synthesis in the songbird forebrain. Proc. Natl Acad. Sci. USA 92, 3406–3410 (1995).
Bianco, R. et al. Long-term implicit memory for sequential auditory patterns in humans. eLife 9, e56073 (2020).
Miller, K. J., Honey, C. J., Hermes, D., Rao, R. P. & Ojemann, J. G. Broadband changes in the cortical surface potential track activation of functionally diverse neuronal populations. Neuroimage 85, 711–720 (2014).
Leszczyński, M. et al. Dissociation of broadband high-frequency activity and neuronal firing in the neocortex. Sci. Adv. 6, eabb0977 (2020).
Günel, B., Thiel, C. M. & Hildebrandt, K. J. Effects of exogenous auditory attention on temporal and spectral resolution. Front. Psychol. 9, 1984 (2018).
Norman-Haignere, S. V. et al. Pitch-responsive cortical regions in congenital amusia. J. Neurosci. 36, 2986–2994 (2016).
Norman-Haignere, S. et al. Intracranial recordings from human auditory cortex reveal a neural population selective for musical song. Preprint at bioRxiv https://doi.org/10.1101/696161 (2020).
Boebinger, D., Norman-Haignere, S. V., McDermott, J. H. & Kanwisher, N. Music-selective neural populations arise without musical training. J. Neurophysiol. 125, 2237–2263 (2021).
Morosan, P. et al. Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. Neuroimage 13, 684–701 (2001).
Baumann, S., Petkov, C. I. & Griffiths, T. D. A unified framework for the organization of the primate auditory cortex. Front. Syst. Neurosci. 7, 11 (2013).
Barr, D. J., Levy, R., Scheepers, C. & Tily, H. J. Random effects structure for confirmatory hypothesis testing: keep it maximal. J. Mem. Lang. 68, 255–278 (2013).
Kuznetsova, A., Brockhoff, P. B. & Christensen, R. H. lmerTest package: tests in linear mixed effects models. J. Stat. Softw. 82, 1–26 (2017).
Gelman, A. & Hill, J. Data Analysis Using Regression and Multilevel/Hierarchical Models (Cambridge Univ. Press, 2006).
Schielzeth, H. et al. Robustness of linear mixed-effects models to violations of distributional assumptions. Methods Ecol. Evol. 11, 1141–1152 (2020).
de Cheveigné, A. & Parra, L. C. Joint decorrelation, a versatile tool for multichannel data analysis. Neuroimage 98, 487–505 (2014).
Murphy, K. P. Machine Learning: A Probabilistic Perspective (MIT Press, 2012).
de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L. & Theunissen, F. E. The hierarchical cortical organization of human speech processing. J. Neurosci. 37, 6539–6557 (2017).
Marquardt, D. W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 11, 431–441 (1963).
Fisher, W. M. tsylb: NIST syllabification software, version 2 revised (1997).
We thank D. Maksumov, N. Agrawal, S. Montenegro, L. Yu, M. Leszczynski and I. Tal for help with data collection, S. Montenegro and H. Wang for help in localizing electrodes and A. Kell, S. David, J. McDermott, B. Conway, N. Kanwisher, N. Kriegeskorte and M. Leszczynski for comments on an earlier draft of this manuscript. This study was supported by the National Institutes of Health (NIDCD-K99-DC018051 to S.V.N.-H., NIDCD-R01-DC014279 to N.M., S10 OD018211 to N.M., NINDS-R01-NS084142 to C.A.S. and NIDCD-R01-DC018805 to N.M./A.F.) and the Howard Hughes Medical Institute (LSRF postdoctoral award to S.V.N.-H.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
The authors declare no competing interests.
Peer review information
Nature Human Behaviour thanks Jérémy Giroud, Jonas Obleser, Benjamin Morillon and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Histogram of phoneme, syllable, and word durations in TIMIT.
Durations of phonemes, multi-phoneme syllables and multi-syllable words in the commonly used TIMIT database. Phonemes and words are labeled in the database; syllables were computed from the phoneme labels using the software tsylb2 (ref. 87). The median duration for each structure is 64, 197 and 479 ms, respectively.
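This tabulation can be sketched from TIMIT-style `.phn` label files, which list start/end sample pairs at 16 kHz for each phone (the helper name, toy lines and silence-label list below are our own, purely illustrative):

```python
from statistics import median

def phone_durations_ms(phn_lines, fs=16000):
    """Phoneme durations (ms) from TIMIT-style .phn lines of the form
    'start end label', with start/end given in samples at rate fs."""
    durations = []
    for line in phn_lines:
        start, end, label = line.split()
        if label not in ("h#", "pau", "epi"):  # skip silence/pause markers
            durations.append((int(end) - int(start)) / fs * 1000.0)
    return durations

# Toy example: 'sh' spans 960 samples (60 ms), 'iy' spans 1,120 samples (70 ms)
demo = ["0 1600 h#", "1600 2560 sh", "2560 3680 iy", "3680 5000 h#"]
durs = phone_durations_ms(demo)
med = median(durs)
```

The same loop over all sentences in the corpus yields the duration histograms summarized in this figure.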
Extended Data Fig. 2 Cross-context correlation for 20 representative electrodes.
Electrodes were selected to illustrate the diversity of integration windows. Specifically, we partitioned all sound-responsive electrodes into 5 groups based on the width of their integration window, estimated using a model (Fig. 3 illustrates the model). For each group, we plot the four electrodes with the highest SNR (as measured by the test-retest correlation across the sound set). Electrodes have been sorted by their integration width, which is indicated to the right of each plot, along with the location, hemisphere and subject number for each electrode. Each plot shows the cross-context correlation and noise ceiling for a single electrode and segment duration (indicated above each column). Because the number of segments was inversely proportional to segment duration, there were more segments at the shorter durations, and the cross-context correlation and noise ceiling were correspondingly more stable. This stability is valuable because the shorter segment durations have fewer relevant time lags, so reliability at those lags matters more. The model used to estimate integration windows pooled across all lags and segment durations, taking into account the reliability of each data point.
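The core of the cross-context correlation can be sketched as follows (a simplified illustration, not the paper's released TCI code; the function name and toy data are our own): responses time-locked to the same segments are correlated across presentations in two different random contexts, lag by lag.

```python
import numpy as np

def cross_context_corr(resp_a, resp_b):
    """Lag-by-lag correlation, across segments, of responses to identical
    segments heard in two different contexts.

    resp_a, resp_b : (n_segments, n_lags) arrays time-locked to segment
        onset; rows index segments, columns index lags after onset.
    """
    n_lags = resp_a.shape[1]
    return np.array([
        np.corrcoef(resp_a[:, lag], resp_b[:, lag])[0, 1]
        for lag in range(n_lags)
    ])

# Toy check: a response driven only by the segment (context-invariant)
# yields high cross-context correlation at every lag.
rng = np.random.default_rng(0)
segment_drive = rng.standard_normal((30, 20))      # 30 segments, 20 lags
resp_a = segment_drive + 0.1 * rng.standard_normal((30, 20))
resp_b = segment_drive + 0.1 * rng.standard_normal((30, 20))
corr = cross_context_corr(resp_a, resp_b)
```

Lags at which the correlation approaches the noise ceiling are those at which the response depends only on the segment, not its context; lags at which it falls below the ceiling reveal context dependence and hence the integration window.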
Extended Data Fig. 3 Simulation results.
a, Integration windows estimated from four different model responses (from top to bottom): (1) a model that integrated waveform magnitudes within a known window; (2) a model that integrated energy within a cochlear frequency band; (3) a model that integrated spectrotemporal energy in a cochleagram representation of sound; and (4) a simple, deep neural network. All models had a ground-truth, Gamma-distributed integration window. We independently varied the width and centre of the integration window (excluding non-causal combinations) and tested whether we could infer the ground-truth values. Results are shown for several different SNRs, as measured by the test-retest correlation of the response across repetitions, the same metric used to select electrodes (we selected electrodes with a test-retest correlation greater than 0.1). Black dots correspond to a single model window/simulation. Red dots show the median estimate across all windows/simulations. Some models included more variants (for example, different spectrotemporal filters), which is why some plots have a higher dot density. There is a small upward bias for very narrow integration widths (31 ms), probably due to the filter used to measure broadband gamma, which has an integration width of ~19 ms. The integration widths of our electrodes (~50 to 400 ms) were mostly above the point at which this bias would have a substantial effect, and the bias works against our observed results, since it compresses the possible range of integration widths. b, Integration windows estimated without explicitly modeling and accounting for boundary effects. Results are shown for the spectrotemporal model, which produces strong responses at the boundary between two segments due to prominent spectrotemporal changes. Note that integration widths, in particular, show a nontrivial upward bias when boundary effects are not accounted for (see Methods for a more detailed discussion).
c, Integration windows estimated without accounting for an upward bias in the squared error loss. The bias grows as the SNR decreases (see Methods for an explanation). Results are shown for the waveform amplitude model, but the bias is present for all models since it is caused by the loss. Our bias-corrected loss largely corrected the problem, as can be observed in panel a.
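The logic of these simulations — a response that integrates a stimulus feature within a causal, Gamma-shaped window — can be sketched as follows (our own minimal parameterization by Gamma shape and scale, not the paper's width/centre parameterization):

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_window(shape_k, scale_ms, dur_ms=500, dt_ms=1.0):
    """Causal Gamma-shaped window sampled at dt_ms; weights sum to ~1."""
    t = np.arange(dt_ms, dur_ms + dt_ms, dt_ms)
    pdf = (t ** (shape_k - 1) * np.exp(-t / scale_ms)
           / (gamma_fn(shape_k) * scale_ms ** shape_k))
    return pdf * dt_ms

def simulate_response(envelope, window):
    """Model response: running weighted average of the stimulus history."""
    return np.convolve(envelope, window)[: len(envelope)]

win = gamma_window(shape_k=3.0, scale_ms=30.0)   # mean lag = shape * scale = 90 ms
env = np.abs(np.random.default_rng(1).standard_normal(2000))  # toy 1 ms envelope
resp = simulate_response(env, win)
```

Applying the cross-context analysis to such simulated responses, where the window is known exactly, is what allows the recovery accuracy shown in this figure to be quantified.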
Extended Data Fig. 4 Integration windows for different electrode types and subjects.
a, This panel plots integration widths (left) and centres (right) for individual electrodes as a function of distance to primary auditory cortex, defined as posteromedial Heschl’s gyrus. The electrodes have been labeled by their type (grid, depth, strip). The grid/strip electrodes were located further from primary auditory cortex on average but, given their location, did not show any obvious difference in integration properties. The effect of distance was significant for the depth electrodes alone (the most numerous type of electrode) when excluding grids and strips (width: F(1,14.53) = 24.51, p < 0.001, β(distance) = 0.065 octaves/mm, CI = [0.039, 0.090]; centre: F(1,12.83) = 27.76, p < 0.001, β(distance) = 0.052 octaves/mm, CI = [0.032, 0.071]; N = 114 electrodes). To be conservative, electrode type was included as a covariate in the linear mixed-effects model used to assess significance as a whole. b, Same as panel a but indicating subject membership instead of electrode type. Each symbol corresponds to a unique subject. The effect of distance on integration windows is broadly distributed across the 18 subjects.
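The statistical model behind these tests can be sketched with `statsmodels` on synthetic stand-in data (a simplified analogue: the paper fit its models with lmerTest in R and included additional covariates such as electrode type; octaves here are log2 units of integration width):

```python
# pip install pandas statsmodels
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data: log2 integration width grows with distance from
# primary auditory cortex at ~0.06 octaves/mm, with per-subject offsets.
rng = np.random.default_rng(2)
n = 150
subjects = rng.integers(0, 18, n).astype(str)
offsets = dict(zip(map(str, range(18)), 0.2 * rng.standard_normal(18)))
df = pd.DataFrame({"distance_mm": rng.uniform(0, 40, n), "subject": subjects})
df["log2_width"] = (np.log2(100) + 0.06 * df["distance_mm"]
                    + df["subject"].map(offsets)
                    + 0.3 * rng.standard_normal(n))

# Linear mixed model: fixed effect of distance, random intercept per subject
fit = smf.mixedlm("log2_width ~ distance_mm", df, groups=df["subject"]).fit()
slope = fit.params["distance_mm"]   # estimated octaves/mm
```

The random intercept per subject is what lets electrodes be pooled across patients without treating them as independent samples.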
Extended Data Fig. 5 Robustness analyses.
a, Sound segments were excerpted from 10 sounds. This panel shows integration windows estimated using segments drawn from two non-overlapping splits of 5 sounds each (listed on the left). Since many non-primary regions only respond strongly to speech or music8,9,11, we included speech and music in both splits. Format is analogous to Fig. 4 but only showing integration widths (integration centres were also similar across analysis variants). The effect of distance was significant for both splits (split 1: F(1,12.660) = 40.20, p < 0.001, β(distance) = 0.069 octaves/mm, CI = [0.047, 0.090], N = 136 electrodes; split 2: F(1,21.66) = 30.11, p < 0.001, β(distance) = 0.066 octaves/mm, CI = [0.043, 0.090], N = 135 electrodes). b, Shorter segments were created by subdividing longer segments, which made it possible to consider two types of context (see schematic): (1) random context, in which each segment is surrounded by random other segments; and (2) natural context, in which a segment is a subset of a longer segment and is thus surrounded by its natural context. When comparing responses across contexts, one of the two contexts must be random so that the contexts differ, but the other context can be random or natural. Our main analyses pooled across both types of comparison. Here, we show integration widths estimated by comparing either purely random contexts (top panel) or random and natural contexts (bottom panel). The effect of distance was significant for both types of context comparison (random–random: F(1,28.056) = 30.01, p < 0.001, β(distance) = 0.064 octaves/mm, CI = [0.041, 0.087], N = 121 electrodes; random–natural: F(1,18.816) = 27.087, p < 0.001, β(distance) = 0.062 octaves/mm, CI = [0.039, 0.086], N = 154 electrodes). c, We modeled integration windows using window shapes that varied from more exponential to more Gaussian (the parameter γ in equations 2 and 3 controls the shape of the window; see Methods).
For our main analysis, we selected the shape that yielded the best prediction for each electrode. This panel shows integration widths estimated using two different fixed shapes. The effect of distance was significant for both shapes (γ = 1: F(1,21.712) = 24.85, p < 0.001, β(distance) = 0.067 octaves/mm, CI = [0.040, 0.093], N = 154 electrodes; γ = 4: F(1,20.973) = 19.38, p < 0.001, β(distance) = 0.055 octaves/mm, CI = [0.031, 0.080], N = 154 electrodes). d, Similar results were obtained using two different frequency ranges to measure gamma power (70–100 Hz: F(1,21.05) = 19.38, p < 0.001, β(distance) = 0.058 octaves/mm, CI = [0.032, 0.083], N = 133 electrodes; 100–140 Hz: F(1,20.56) = 12.57, p < 0.01, β(distance) = 0.051 octaves/mm, CI = [0.023, 0.080], N = 131 electrodes).
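The exponential-to-Gaussian shape family can be illustrated with Gamma-distributed windows, whose shape parameter plays the role of γ here (a sketch of the general idea, not the paper's exact equations 2 and 3): γ = 1 gives an exponentially decaying window, while larger γ gives a more symmetric, Gaussian-like window.

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_pdf(t, shape_k, scale):
    """Gamma probability density evaluated at times t (t > 0)."""
    return (t ** (shape_k - 1) * np.exp(-t / scale)
            / (gamma_fn(shape_k) * scale ** shape_k))

t = np.arange(1.0, 1000.0)                           # lags in ms
exp_like = gamma_pdf(t, shape_k=1.0, scale=50.0)     # γ = 1: exponential decay
gauss_like = gamma_pdf(t, shape_k=4.0, scale=50.0)   # γ = 4: near-Gaussian

# γ = 1 decays monotonically from onset; γ = 4 peaks away from onset,
# at (γ - 1) * scale = 150 ms, with skewness 2 / sqrt(γ) shrinking as γ grows.
```

Because the skewness shrinks as 2/√γ, fixing γ = 1 versus γ = 4 spans the range of window shapes tested in this panel.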
Extended Data Fig. 6 Relationship between integration widths and centres without any causality constraint.
Extended Data Fig. 7 Components most selective for sound categories at different integration widths.
Electrodes were subdivided into three equally sized groups based on the width of their integration window. The time-averaged response of each electrode was then projected onto the top 2 components that showed the greatest category selectivity, measured using linear discriminant analysis (each circle corresponds to a unique sound). Same format as Fig. 5b, which plots responses projected onto the top 2 principal components. Half of the sounds were used to compute the components, and the other half were used to measure their response to avoid statistical circularity. As a consequence, there are half as many sounds as in Fig. 5b.
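The component analysis can be sketched with scikit-learn's linear discriminant analysis (synthetic data and our own split-half setup, purely illustrative of the circularity-avoiding logic described above):

```python
# pip install scikit-learn
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
n_sounds, n_electrodes = 60, 40
labels = np.tile([0, 1, 2], n_sounds // 3)     # e.g. speech / music / other
X = rng.standard_normal((n_sounds, n_electrodes))  # time-averaged responses
X[labels == 0, :5] += 2.0                      # inject category structure

# Avoid circularity: fit the discriminant components on one half of the
# sounds, then project only the held-out half onto them.
train, test = np.arange(0, 30), np.arange(30, 60)
lda = LinearDiscriminantAnalysis(n_components=2).fit(X[train], labels[train])
projected = lda.transform(X[test])             # top 2 category components
```

Only the held-out projections are plotted, which is why half as many sounds appear here as in Fig. 5b.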
Extended Data Fig. 8 Results for integration-matched responses.
a, For our functional selectivity analyses, we subdivided the electrodes into three equally sized groups based on the width of their integration window. To test whether our results were an inevitable consequence of differences in temporal integration, we matched the integration windows across the electrodes in each group. Matching was performed by integrating the responses from the electrodes in the short and intermediate groups within an appropriately chosen window, such that the resulting integration window matched those of the longest group (see Integration matching in Methods). This figure plots a histogram of the effective integration windows after matching. b–d, These panels show the results of applying our functional selectivity analyses to integration-matched responses. Format is the same as Fig. 5b–d.
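The matching step rests on the fact that smoothing a response with an extra window yields an effective integration window equal to the convolution of the neural window and the smoothing window. A toy illustration (our own boxcar example at 1 ms sampling, not the paper's exact matching kernels):

```python
import numpy as np

# A ~50 ms "neural" window cascaded with a ~150 ms smoothing window gives an
# effective window whose mass is spread over ~200 ms (a 199-sample trapezoid).
neural = np.ones(50) / 50
smoothing = np.ones(150) / 150
effective = np.convolve(neural, smoothing)

def match_integration(resp, window):
    """Smooth a response with an extra causal window (unit-area weights),
    lengthening its effective integration window."""
    w = window / window.sum()
    return np.convolve(resp, w)[: len(resp)]
```

Choosing the smoothing window so that the effective window's width matches that of the long-integration group is what produces the histogram in panel a.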
Supplementary Fig. 1.
Source Data Fig. 4
Integration widths and centres for all electrodes along with their distance to primary auditory cortex and relevant metadata (that is, hemisphere, subject ID and electrode type).
Source Data Fig. 5
Principal component loadings plotted in Fig. 5b. Prediction accuracies for acoustic features and category labels for all electrodes along with relevant metadata (that is, hemisphere, subject ID, electrode type and reliability ceiling).
Cite this article
Norman-Haignere, S.V., Long, L.K., Devinsky, O. et al. Multiscale temporal integration organizes hierarchical computation in human auditory cortex. Nat Hum Behav 6, 455–469 (2022). https://doi.org/10.1038/s41562-021-01261-y