Sounds can arise from the environment and also predictably from many of our own movements, such as vocalizing, walking, or playing music. The capacity to anticipate these movement-related (reafferent) sounds and distinguish them from environmental sounds is essential for normal hearing (refs. 1, 2), but the neural circuits that learn to anticipate the often arbitrary and changeable sounds that result from our movements remain largely unknown. Here we developed an acoustic virtual reality (aVR) system in which a mouse learned to associate a novel sound with its locomotor movements, allowing us to identify the neural circuit mechanisms that learn to suppress reafferent sounds and to probe the behavioural consequences of this predictable sensorimotor experience. We found that aVR experience gradually and selectively suppressed auditory cortical responses to the reafferent frequency, in part by strengthening motor cortical activation of auditory cortical inhibitory neurons that respond to the reafferent tone. This plasticity is behaviourally adaptive, as aVR-experienced mice showed an enhanced ability to detect non-reafferent tones during movement. Together, these findings describe a dynamic sensory filter that involves motor cortical inputs to the auditory cortex that can be shaped by experience to selectively suppress the predictable acoustic consequences of movement.
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Schneider, D. M., Nelson, A. & Mooney, R. A synaptic and circuit basis for corollary discharge in the auditory cortex. Nature 513, 189–194 (2014).
Weiss, C., Herwig, A. & Schütz-Bosbach, S. The self in action effects: selective attenuation of self-generated sounds. Cognition 121, 207–218 (2011).
Kuchibhotla, K. V. et al. Parallel processing by cortical inhibition enables context-dependent behavior. Nat. Neurosci. 20, 62–71 (2017).
Zhou, M. et al. Scaling down of balanced excitation and inhibition by active behavioral states in auditory cortex. Nat. Neurosci. 17, 841–850 (2014).
Rummell, B. P., Klee, J. L. & Sigurdsson, T. Attenuation of responses to self-generated sounds in auditory cortical neurons. J. Neurosci. 36, 12010–12026 (2016).
Flinker, A. et al. Single-trial speech suppression of auditory cortex activity in humans. J. Neurosci. 30, 16643–16650 (2010).
Eliades, S. J. & Wang, X. Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations. J. Neurophysiol. 89, 2194–2207 (2003).
Singla, S., Dempsey, C., Warren, R., Enikolopov, A. G. & Sawtell, N. B. A cerebellum-like circuit in the auditory system cancels responses to self-generated sounds. Nat. Neurosci. 20, 943–950 (2017).
Curio, G., Neuloh, G., Numminen, J., Jousmäki, V. & Hari, R. Speaking modifies voice-evoked activity in the human auditory cortex. Hum. Brain Mapp. 9, 183–191 (2000).
Keller, G. B. & Hahnloser, R. H. R. Neural processing of auditory feedback during vocal practice in a songbird. Nature 457, 187–190 (2009).
Eliades, S. J. & Wang, X. Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature 453, 1102–1106 (2008).
Houde, J. F. & Jordan, M. I. Sensorimotor adaptation in speech production. Science 279, 1213–1216 (1998).
Mifsud, N. G. & Whitford, T. J. Sensory attenuation of self-initiated sounds maps onto habitual associations between motor action and sound. Neuropsychologia 103, 38–43 (2017).
Moore, A. K. & Wehr, M. Parvalbumin-expressing inhibitory interneurons in auditory cortex are well-tuned for frequency. J. Neurosci. 33, 13713–13723 (2013).
Fino, E. & Yuste, R. Dense inhibitory connectivity in neocortex. Neuron 69, 1188–1203 (2011).
Znamenskiy, P. et al. Functional selectivity and specific connectivity of inhibitory neurons in primary visual cortex. Preprint at https://www.biorxiv.org/content/early/2018/04/04/294835 (2018).
Williamson, R. S., Hancock, K. E., Shinn-Cunningham, B. G. & Polley, D. B. Locomotion and task demands differentially modulate thalamic audiovisual processing during active search. Curr. Biol. 25, 1885–1891 (2015).
Nelson, A. et al. A circuit for motor cortical modulation of auditory cortical activity. J. Neurosci. 33, 14342–14353 (2013).
Nelson, A. & Mooney, R. The basal forebrain and motor cortex provide convergent yet distinct movement-related inputs to the auditory cortex. Neuron 90, 635–648 (2016).
Wilson, N. R., Runyan, C. A., Wang, F. L. & Sur, M. Division and subtraction by distinct cortical inhibitory networks in vivo. Nature 488, 343–348 (2012).
Wolpert, D. M., Ghahramani, Z. & Jordan, M. I. An internal model for sensorimotor integration. Science 269, 1880–1882 (1995).
Keller, G. B., Bonhoeffer, T. & Hübener, M. Sensorimotor mismatch signals in primary visual cortex of the behaving mouse. Neuron 74, 809–815 (2012).
Froemke, R. C., Merzenich, M. M. & Schreiner, C. E. A synaptic memory trace for cortical receptive field plasticity. Nature 450, 425–429 (2007).
Froemke, R. C. et al. Long-term modification of cortical synapses improves sensory perception. Nat. Neurosci. 16, 79–88 (2013).
McGinley, M. J., David, S. V. & McCormick, D. A. Cortical membrane potential signature of optimal states for sensory signal detection. Neuron 87, 179–192 (2015).
Leinweber, M., Ward, D. R., Sobczak, J. M., Attinger, A. & Keller, G. B. A sensorimotor circuit in mouse cortex for visual flow predictions. Neuron 95, 1420–1432.e5 (2017).
Franklin, K. B. & Paxinos, G. The Mouse Brain in Stereotaxic Coordinates, Compact. The Coronal Plates and Diagrams (Elsevier, Amsterdam, 2008).
Glickfeld, L. L., Histed, M. H. & Maunsell, J. H. Mouse primary visual cortex is used to detect both orientation and contrast changes. J. Neurosci. 33, 19416–19422 (2013).
We thank K. Tschida, M. Tanaka, and D. Purves for their comments on this manuscript; members of the Mooney laboratory for discussions regarding experimental design and data analysis; J. Pearson for comments regarding statistical analyses; and M. Booze for animal care and technical support. This research was supported by an HHMI fellowship of the Helen Hay Whitney Foundation and a Career Award at the Scientific Interface from the Burroughs Wellcome Fund (D.M.S.), the Holland-Trice Graduate Fellowship in Brain Sciences (J.S.), and NIH grant 5 R01 DC013826 (R.M.).
Nature thanks S. Eliades, G. Keller and the other anonymous reviewer(s) for their contribution to the peer review of this work.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Heat map showing the rate of tone presentation as a function of instantaneous stepping rate with a single paw, measured via simultaneous videography. Data points show mean ± s.d. tone rate and stepping rate in 1-Hz (1 s⁻¹) bins. Red dashed line shows linear regression through all data points. The rate of reafferent tones during aVR experience was strongly correlated with instantaneous paw stepping rate (0.78). Data are from 3,716 steps recorded from 1,804 s of video from two mice. b, Average tone presentation rate during aVR experience closely matches average stepping rate measured either with a single paw or two paws. Dots are median and error bars are s.d. c, Cumulative distance run by 11 mice over 6–9 days of aVR experience. Each line is for a different mouse, colour-coded by the reafferent frequency to which the mouse was acclimated. d, Cumulative number of tones heard by the same 11 mice as in c.
Extended Data Fig. 2 aVR experience alters locomotion-related suppression at the level of individual neurons.
a, Fraction of neurons with elevated firing rates (magenta) and suppressed firing rates (cyan) in response to tones during rest. A roughly equal number of neurons were excited by the reafferent frequency as were excited by other frequencies. b, Fraction of rest-responsive neurons with elevated firing rates (magenta) and suppressed firing rates (cyan) in response to tones of varying frequency during running. Nearly 50% of neurons were responsive to non-reafferent frequencies during running, whereas fewer than 25% were responsive to the reafferent frequency. c, Heat map showing response strength (tone-evoked rate – baseline rate) for neurons responsive to the expected reafferent frequency (left, n = 114 neurons, N = 11 mice) and another frequency (+2 octaves, n = 120 neurons, N = 11 mice) during rest. Neurons ordered by magnitude of response independently for each heat map. d, Response strengths of the neurons in c during running. Neurons are re-sorted by magnitude of response. Twenty-three per cent of neurons retained their response to the reafferent frequency during running, consistent with a sparse representation of expected reafferent sounds. e, Two alternative models for how locomotion-related suppression could change following aVR experience. In each model, the black curves show frequency tuning curves of three neurons during rest, red curves during running, and the green dashed line indicates the reafferent frequency. Across-neuron model: locomotion-related suppression is uniform across frequencies within a neuron but is strongest for neurons that are strongly responsive to the expected reafferent frequency. Within-neuron model: suppression is non-uniform at the single neuron level and regardless of how strongly the neuron responds to the expected reafferent frequency, suppression is always strongest at the reafferent frequency. f, Tuning curves for five example neurons measured during rest (black) and running (red). 
The best frequency (BF) for each neuron is shown by the blue triangle, and the reafferent frequency to which each mouse was acclimated is shown by the green dashed line. In all five neurons, locomotion-related suppression was strong at the reafferent frequency relative to other frequencies, regardless of the neuron’s best frequency. g, Neurons were sorted by their best frequency, measured relative to the reafferent frequency that each mouse experienced. Locomotion-related suppression at the expected reafferent frequency (green) and averaged across all non-reafferent frequencies (black). Regardless of a neuron’s best frequency, suppression was always strongest at the reafferent frequency, supporting the within-neuron model in e. Sample size: N = 11 mice, n = 314 neurons. Shaded regions show 95% confidence bounds estimated with a bootstrap analysis repeated 1,000 times. h, Probability of observing a minimum in the gain function of individual neurons at each frequency, measured relative to the reafferent frequency. A substantial number of neurons had minima in their gain functions at the expected reafferent frequency, further supporting the within-neuron model in e. Sample size: N = 11 mice, n = 314 neurons. Shaded region shows a null distribution, which we estimated by randomly assigning to each neuron a reafferent frequency rather than using the actual frequency experienced by the mouse from which the neuron was recorded. This shuffling was performed 1,000 times and the 95% confidence bounds of the distribution were computed. Error bars show the 95% confidence bounds estimated from a bootstrap analysis repeated 1,000 times.
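The bootstrap confidence bounds and the shuffled null distribution described in panels g and h can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the percentile method, and the input layout (one value per neuron, frequencies coded as integer indices) are all assumptions made for the example.

```python
import numpy as np

def bootstrap_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence bounds for the mean of `values`.

    `values` holds one measurement per neuron (hypothetical layout).
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample neurons with replacement, n_boot times.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    lo, hi = np.percentile(boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

def shuffled_null(min_idx, n_freqs, n_shuffle=1000, seed=0):
    """95% bounds on the fraction of neurons whose gain minimum falls on a
    randomly assigned 'reafferent' frequency.

    `min_idx` is the frequency index of each neuron's gain minimum
    (hypothetical encoding); each shuffle draws a random frequency per neuron.
    """
    rng = np.random.default_rng(seed)
    min_idx = np.asarray(min_idx)
    fractions = np.empty(n_shuffle)
    for i in range(n_shuffle):
        fake_reafferent = rng.integers(0, n_freqs, size=len(min_idx))
        fractions[i] = np.mean(min_idx == fake_reafferent)
    return np.percentile(fractions, [2.5, 97.5])
```

An observed fraction lying above the upper bound returned by `shuffled_null` would correspond to a point falling outside the shaded null region in panel h.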
a, Locomotion-related gain tested at half-octave spacing from the reafferent frequency. Neuronal responses to frequencies half an octave from the reafferent frequency were suppressed at an intermediate level. Data are mean ± s.e. Sample size: N = 4 mice, n = 106 neurons. b, Example current-source density triggered by tone-onset for electrode recordings made perpendicular to the auditory cortical surface. Black dashed line demarcates putative supragranular (SG) and infragranular (IG) layers of cortex. Electrode 1 is the most superficial; electrode spacing is 100 μm. c, Example tone-evoked local field potential (LFP) traces from an SG electrode (left) and an IG electrode (right) in response to the expected reafferent frequency (left) and a non-reafferent frequency (right). Locomotion-related suppression of LFP responses was stronger for the reafferent frequency than for non-reafferent frequencies. Data are mean ± s.e. d, The difference in LFP between rest and running as a function of electrode location (1 is the most superficial; electrode spacing is 100 μm; N = 3 mice). Positive values indicate greater suppression during locomotion.
Extended Data Fig. 4 Frequency-specific locomotion-related suppression requires several days of coupled sensory–motor experience.
a, Example sensory–motor experience during anti-coupled aVR experience. Mice did not hear tones while running, but tones were played back during subsequent resting periods with inter-tone intervals drawn from the intervals that mice should have heard while running. b, Population PSTHs for the expected frequency (left) and for non-reafferent frequencies (right) during rest (black) and running (red) following anti-coupled aVR. Anti-coupled aVR experience does not lead to changes in auditory responsiveness during running or rest. Sample size: N = 4 mice, n = 97 neurons. Shaded region shows mean ± s.e. P = 0.57, two-sided Wilcoxon rank sum test. c, Example sensory–motor experience during metronome aVR experience. Tones were presented during running at a fixed rate (2 s⁻¹) but the tone rate was not modulated by running speed. d, Population PSTHs for the expected frequency (left) and for non-reafferent frequencies (right) during rest (black) and running (red) following metronome aVR. Metronome aVR experience does not lead to changes in auditory responsiveness during running or rest. Sample size: N = 2 mice, n = 49 neurons. Shaded region shows mean ± s.e. P = 0.57, two-sided Wilcoxon rank sum test. e, Mice were acclimated to aVR for 7 days. On the day of electrophysiology, we altered on each locomotor bout the sound produced by the treadmill to be either expected (blue) or a non-reafferent frequency (2 octaves away, red). We then analysed responses (N = 4 mice, n = 74 neurons) to each sound frequency during rest (R) and to the first five tones heard at the beginning of each bout of locomotion (L1–L5). (i) Tone-evoked responses (population PSTHs) to the reafferent (blue) and a non-reafferent sound (red) during rest. (ii) Tone-evoked responses during locomotion to the first five tones in a series of the expected reafferent frequency. (iii) Tone-evoked responses during locomotion to the first five tones heard in a series of non-reafferent tones. 
f, Firing rates in response to the reafferent (blue) and non-reafferent (red) sounds during rest (R) and during the first five tones heard during locomotion (L1–L5). Responses to the first tone heard during locomotion were significantly suppressed only if that tone matched the expected reafferent frequency (blue asterisk, P = 0.002, two-sided Wilcoxon signed rank test). Black asterisks indicate significant differences between firing rates in response to the reafferent and non-reafferent sounds (L1, P = 0.002; L2, P = 0.03; L3, P = 0.007, two-sided Wilcoxon rank sum test). Sample size: N = 4 mice, n = 74 neurons. Red n.s. indicates that evoked responses to the first tone heard during a bout of running are not significantly different from those evoked during rest for non-reafferent tones (P = 0.4, two-sided Wilcoxon signed rank test). g, Population PSTHs for the expected frequency (left) and for non-reafferent frequencies (right) during rest (black) and running (red). Data were collected from three mice (n = 67 neurons) after each mouse’s first experience of hearing fixed-frequency reafferent tones for 1 h, during which time mice heard 927, 3,167 and 1,069 reafferent tones at 16 kHz, 2 kHz and 16 kHz, respectively. This experience was insufficient to shift the locomotion-related suppression towards the reafferent frequency. Shaded region shows mean ± s.e. P = 0.47, two-sided Wilcoxon rank sum test.
a, Voltage trace of a pi-IN recorded from a VGAT::ChR2 mouse in response to a 100-ms pulse of blue light targeted to the cortical surface. Inset shows example waveforms belonging to the sorted unit (black) and belonging to the noise cluster (magenta), showing good electrophysiological isolation. b, Rasters showing response of the same neuron to 30 pulses of blue light (100 ms each). c, Tone-evoked responses of auditory cortical inhibitory neurons (VGAT+) during rest (black) and locomotion (red) in response to reafferent (left) and non-reafferent (right) frequencies. Responses are suppressed during locomotion, but suppression is not specific to the reafferent frequency. Sample size: N = 5 mice, n = 71 neurons. Shaded region shows mean ± s.e. P = 0.36, two-sided Wilcoxon rank sum test. d, Spontaneous firing rate during rest and locomotion for 93 putative excitatory neurons (non-photo-identified in VGAT::ChR2 mice, N = 7 mice). Filled circle shows mean. Firing rates were significantly lower during running relative to rest (two-sided Wilcoxon signed rank test). e, pi-INs (VGAT+) that were more strongly driven by the reafferent frequency were more strongly recruited during running. N = 2 mice, n = 47 neurons. Black line and shaded area show linear regression and 95% confidence bounds from a bootstrap analysis repeated 1,000 times, respectively. The P value represents the probability that the slope of the regression line includes zero, estimated from the bootstrap analysis. f, Tone-evoked responses during running and rest for the reafferent frequency (blue) and non-reafferent frequencies (±2 octaves, red). Dots are responses of individual neurons (N = 11 mice, n = 317), lines are linear regression, and shaded regions are 95% confidence bounds from bootstrap analysis repeated 1,000 times. 
Suppression of non-reafferent sounds is best fit by a gain model (slope = 0.47 ± 0.05; offset = –0.19 ± 0.70), whereas suppression of expected reafferent tones has a stronger gain component (that is, shallower slope, two-sided Wilcoxon rank sum test, P = 3.3 × 10⁻³¹⁷) and an offset term that is significantly different from zero (slope = 0.27 ± 0.4; offset = –3.55 ± 0.58; two-sided signed rank test, P = 3.3 × 10⁻¹⁶⁵). Inset shows a magnified view of the regression lines near the origin. These data suggest that suppression of expected reafferent sounds involves both divisive and subtractive forms of inhibition. g, Responses to a non-reafferent tone in VGAT+ pi-INs recorded from aVR-acclimated mice were weakly correlated with responses to electrical stimulation in M2 (n = 75 neurons from 5 mice). These data indicate that the strong relationship between tone-evoked responses and M2 stimulation responses in auditory cortical pi-INs is specific to the reafferent frequency. Black line and shaded area show linear regression and 95% confidence bounds from a bootstrap analysis repeated 1,000 times, respectively. h, Responses to the expected reafferent tone in put-ENs recorded from aVR-acclimated mice were correlated with responses to electrical stimulation in M2 (n = 181 neurons from 5 VGAT::ChR2 mice). This effect size for put-ENs is significantly weaker than for pi-INs. Black line and shaded area show linear regression and 95% confidence bounds from a bootstrap analysis repeated 1,000 times, respectively. i, Responses to a non-reafferent tone in VGAT+ pi-INs recorded from naive mice were weakly correlated with responses to electrical stimulation in M2 (n = 41 neurons from 2 mice). Black line and shaded area show linear regression and 95% confidence bounds from a bootstrap analysis repeated 1,000 times, respectively. j, Slope of the linear fit for the relationship shown in Fig. 3i. Error bars show 95% confidence bounds from a bootstrap analysis. 
Data are from regressions shown in Fig. 3g and Extended Data Fig. 5g–i. Slopes of linear fit for PV, VGAT, SST, and all pi-INs are significantly larger than slopes of linear fits for non-reafferent and naive conditions (P < 0.01, Wilcoxon). Bar height determined by linear fit of raw data; error bars show s.e. of linear fits from 1,000 repetitions of bootstrap analysis.
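The gain-and-offset description in panel f amounts to a straight-line fit of running responses against rest responses, in which the slope captures divisive (gain) suppression and the intercept captures subtractive suppression. The sketch below illustrates that idea with a bootstrap over neurons. It is not the authors' code: the function names, the use of `np.polyfit`, and the percentile bounds are assumptions chosen for the example.

```python
import numpy as np

def fit_gain_offset(rest, run):
    """Least-squares line run = slope * rest + offset.

    slope < 1 indicates divisive (gain) suppression during running;
    offset < 0 indicates an additional subtractive component.
    """
    slope, offset = np.polyfit(np.asarray(rest, float), np.asarray(run, float), deg=1)
    return slope, offset

def bootstrap_fit(rest, run, n_boot=1000, seed=0):
    """95% percentile bounds on slope and offset, resampling neurons with replacement.

    Returns an array of shape (2, 2): rows are lower/upper bounds,
    columns are slope/offset.
    """
    rng = np.random.default_rng(seed)
    rest = np.asarray(rest, float)
    run = np.asarray(run, float)
    n = len(rest)
    fits = np.empty((n_boot, 2))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # one bootstrap sample of neurons
        fits[i] = np.polyfit(rest[idx], run[idx], deg=1)
    return np.percentile(fits, [2.5, 97.5], axis=0)
```

Under this reading, an offset whose bootstrap bounds exclude zero would be the signature of subtractive inhibition, while a slope below one reflects divisive scaling.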
Extended Data Fig. 6 Tone detection behaviour is compromised by locomotion, is auditory-cortex dependent, and adapts following VR experience.
a, Data points show mean and s.e. detection rates for N = 4 mice as a function of tone intensity for trials performed during rest with infusion of either saline (black) or muscimol (magenta) into the auditory cortex. b, Difference in performance as a function of intensity for each mouse (grey dots). Large connected dots show mean difference in performance and coloured dots indicate intensities at which performance was significantly different (P < 0.05) across conditions (N = 19 mice, repeated measures two-way ANOVA followed by post-hoc Tukey test). c, Tone-evoked responses from putative excitatory neurons recorded from a VGAT::ChR2 mouse without (black) and with (blue) simultaneous blue laser stimulation. Optogenetic activation of inhibitory neurons decreases the spontaneous and tone-evoked firing rates of excitatory neurons. n = 23 neurons, N = 1 mouse. d, Tone-evoked firing rates during rest are weaker during optogenetic activation of inhibitory interneurons. Dashed line is unity (n = 23 neurons, N = 1 mouse; P < 0.05, two-sided paired t-test). e, Tone detection performance (N = 6 mice) during rest (black) and rest with optogenetic activation of auditory cortical inhibitory neurons (blue). Mice were worse at detecting tones on optogenetic trials (repeated measures two-way ANOVA, factors: intensity × laser state, P(intensity × laser state) = 0.0028, F(2, 10) = 11.23, post-hoc Tukey test at individual intensities, blue asterisk, P < 0.05 on laser trials) compared to rest. f, Tone detection performance (N = 6 mice) during rest (black) and rest with optogenetic activation of M2 terminals in auditory cortex (blue). Four of these mice were presented with 8-kHz tones and the remaining two were presented with 4-kHz tones. Mice were worse at detecting tones on optogenetic trials regardless of the tone frequency (statistics similar to e, P(intensity × laser state) = 0.01, F(2, 10) = 6.66, blue asterisk, P < 0.05 on laser trials). 
g, Average psychometric functions (N = 3 mice) showing detection rates as a function of tone intensity for trials performed during rest when visual cortex was inhibited (repeated measures two-way ANOVA, P(intensity × laser state) = 0.33, F(2, 4) = 1.47). h, Average psychometric functions (N = 2 mice) showing detection rates as a function of tone intensity for trials performed during rest (black) and during rest with laser stimulation (blue) by mice injected with an AAV encoding eGFP in M2. These controls show that laser stimulation of auditory cortex in the absence of ChR2 does not influence behaviour. i, Average psychometric functions (N = 8 mice) showing detection rates as a function of tone intensity for trials performed during rest (black) and during rest with laser stimulation (blue) when the optical fibre was placed over intact skull near, but not directly over, auditory cortex. Five of eight mice were injected with an AAV encoding ChR2 into M2, of which three were presented with 8-kHz tones and two with 4-kHz tones. The other three were VGAT::ChR2 mice presented with 8-kHz tones. These controls show that sham laser stimulation (which is visible to the mouse) alone improves behaviour (repeated measures two-way ANOVA, factors: intensity × laser state, P(interaction) = 0.0066, F(2, 14) = 7.35, post-hoc Tukey tests, blue asterisk, P < 0.05). j, Difference in hit rates in response to tone A relative to tone B during rest before (pre) and after (post) aVR experience with tone A. Lines represent mean difference and shaded regions show s.e. for N = 10 mice. There is no difference in rest performance before and after aVR experience (repeated measures two-way ANOVA in each panel, factors: intensity × time of testing, P(time of testing) = 0.46, F(1, 9) = 0.61). k, Difference in hit rates in response to tone A relative to tone B during running before (pre) and after (post) aVR experience with tone A. Lines represent mean difference and shaded regions show s.e. 
for N = 10 mice. Mice are significantly better at detecting tone B than tone A after aVR experience, indicating that this is a movement-specific change (repeated measures two-way ANOVA in each panel, factors: intensity × time of testing, P(time of testing) = 0.04, F(1, 9) = 8.07, red asterisk, P < 0.05). P values in j, k were corrected using the Holm–Bonferroni method. For further statistical details, see Supplementary Table 1.
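For readers unfamiliar with the Holm–Bonferroni correction mentioned for panels j and k, the step-down procedure can be sketched as below. This is a generic illustration of the standard method, not the authors' code; the function name is hypothetical and the P values in the usage note are made up for the example, not taken from the paper.

```python
import numpy as np

def holm_bonferroni(p_values):
    """Holm–Bonferroni adjusted P values, returned in the input order.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotonic, and each is
    capped at 1.
    """
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)  # indices from smallest to largest P value
    adjusted = np.empty(m)
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

With illustrative inputs `[0.01, 0.04, 0.03]`, the adjusted values are `[0.03, 0.06, 0.06]`, so only the first comparison would remain below a 0.05 threshold.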
This file contains Supplementary Table 1: Summary of measured values and statistics described in Fig. 4 and Extended Data Fig. 6.
Mouse running on aVR treadmill producing tones with a fixed frequency (4 kHz) that are yoked to the mouse’s speed.
Mouse performing tone-detection task while resting (trials 1 and 2 in the video) and running (trials 3 and 4 in the video). Red LED indicates each time the lickometer detects a lick. Mouse was performing the single-frequency version of the task. In this example video, all tones were presented at 60 dB and the mouse correctly detected all of them.