Abstract
Neocortical activity is thought to mediate voluntary control over vocal production, but the underlying neural mechanisms remain unclear. In a highly vocal rodent, the male Alston’s singing mouse, we investigate neural dynamics in the orofacial motor cortex (OMC), a structure critical for vocal behavior. We first describe neural activity that is modulated by component notes (~100 ms), probably representing sensory feedback. At longer timescales, however, OMC neurons exhibit diverse and often persistent premotor firing patterns that stretch or compress with song duration (~10 s). Using computational modeling, we demonstrate that such temporal scaling, acting through downstream motor production circuits, can enable vocal flexibility. These results provide a framework for studying hierarchical control circuits, a common design principle across many natural and artificial systems.
Data availability
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, M.A.L. (mlong@med.nyu.edu). This study did not generate new unique reagents. The datasets generated during this study are available upon request from the lead contact. Source code and documentation required for running all analyses are available.
Code availability
Analysis code is available in a GitHub repository (https://github.com/ccffccffcc/NNSingingMouse).
Acknowledgements
We thank S. Shea, F. Albeanu, W. Bast, J. del Rosario, H. Sloin and members of the Long and Banerjee laboratories for comments on earlier versions of the manuscript. A. Paulson provided technical assistance. Funding was provided by the National Institutes of Health grant R01 NS113071 (M.A.L., S.D.), Simons Collaboration on the Global Brain (M.A.L., S.D.), Searle Scholars Program (A.B.), Klingenstein–Simons fellowship (A.B.) and the Simons Foundation Junior Fellows Program (A.B.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
A.B. and M.A.L. conceived the project; A.B., F.C., S.D. and M.A.L. designed the methodology. A.B. and M.A.L. performed the investigation. A.B., F.C., S.D. and M.A.L. visualized the project. A.B., S.D. and M.A.L. acquired funding. S.D. and M.A.L. administered and supervised the project. A.B. and M.A.L. wrote the original draft of the manuscript; A.B., F.C., S.D. and M.A.L. contributed to writing, review and editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Neuroscience thanks Steffen Hage, Mehrdad Jazayeri and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Determination of significant note- and song-related responses.
(a,b) Example neurons with (a, Cell #19) and without (b, Cell #1) significant note modulation. Spike rasters (top) and spike probability density plots (bottom) for example neurons whose activity profiles have been linearly warped to a common note duration (onsets indicated by green lines). Each row represents the warped spike raster of a neuron aligned to the beginning of a sequence of three notes; responses are sorted based on the original duration of the first note produced in this sequence from longest (top) to shortest (bottom). At right, polar plots describing the tuning of spike times with respect to the relative phase of note production. Dashed lines indicate a uniform distribution. (c) Histogram of note modulation (see Methods) for significantly note-modulated neurons (n = 111) compared with the same analysis applied to nonsinging epochs. (d) Song modulation analysis protocol. Neural activity for songs (black rectangles) is aligned either to their starts (top) or stops (bottom). The evaluation window (song epoch) begins and ends two seconds before and after the shortest song duration of that session. (e) The relative firing rate difference between the song-aligned spiking activity and a nonsinging period for a modulated (left, Cell #169) and unmodulated (right, Cell #187) neuron. 72 song trials are represented by separate lines for each neuron. Significance was determined by bootstrap resampling (***: p < 0.01, two-sided test; n.s.: not significant). (f) Histogram of song modulation values (see Methods) for all song-modulated neurons (n = 133) and those not modulated by song (n = 242).
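The bootstrap test mentioned in panel (e) can be illustrated with a minimal sketch. The exact statistic and resampling scheme are given in the Methods and may differ; here we assume, for illustration only, one firing rate difference per trial (song epoch minus nonsinging period) and ask whether the mean difference is distinguishable from zero. The function name `bootstrap_two_sided_p` and its parameters are hypothetical.

```python
import random
from statistics import mean


def bootstrap_two_sided_p(diffs, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for the mean of per-trial differences.

    Assumption (not from the paper): `diffs` holds one value per song trial,
    the song-epoch firing rate minus the nonsinging firing rate.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    boot_means = []
    for _ in range(n_boot):
        # Resample trials with replacement and store the resampled mean.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(mean(sample))
    # Fraction of resampled means on either side of zero (ties count for both).
    frac_le = sum(m <= 0 for m in boot_means) / n_boot
    frac_ge = sum(m >= 0 for m in boot_means) / n_boot
    return min(1.0, 2 * min(frac_le, frac_ge))


# Hypothetical data: 72 trials with a consistently elevated rate during song.
modulated = [5.0 + 0.1 * i for i in range(72)]
# Hypothetical data: 72 trials whose differences are symmetric around zero.
unmodulated = [-1.0, 1.0] * 36

p_modulated = bootstrap_two_sided_p(modulated)
p_unmodulated = bootstrap_two_sided_p(unmodulated)
```

Under this sketch, a neuron would be called song-modulated when the returned value falls below the 0.01 threshold quoted in the caption.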
Extended Data Fig. 2 Further characterization of note-related responses.
(a,b) Spike times of two example neurons – Cell #5 (a) and Cell #19 (b) – linearly warped to a common note duration (onsets indicated by dashed lines). Each row represents the warped spike raster of a neuron aligned to the beginning of a sequence of three notes; responses are sorted based on the original duration of the first note produced in this sequence from longest (top) to shortest (bottom). Examples in (a) and (b) relate to analyses in Fig. 2e. (c) Spiking activity corresponding to note timing for an example neuron (Cell #180 from Mouse #4). For visualization, analysis was restricted to notes of prespecified durations (top: 55 to 60 ms; bottom: 150 to 200 ms, sample note sonograms provided for each range). For long note durations, robust spiking emerges near the end of each note. Green and red ticks indicate the onset and offset of notes, respectively. (d) Spiking activity from Cell #180 linearly warped to a common note duration (onsets indicated by dashed lines). Timing shifted by a best-fit latency of 110 ms (sensory-like shift). (e) Summary plot (extension from Fig. 2f) showing the latency resulting in the maximum note modulation strength for all note-modulated neurons (n = 111). Gray symbols represent cases that are not significantly different from zero, and red (n = 23) and blue (n = 2) symbols represent points with sensory and motor offsets, respectively.
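The linear warping and latency shift described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: we assume each note interval runs from one note onset to the next, that a sensory-like latency is applied by subtracting it from every spike time before warping, and that spikes outside the note sequence are discarded. The name `warp_spikes` and its parameters are hypothetical.

```python
from bisect import bisect_right


def warp_spikes(spike_times, note_onsets, common_duration=1.0, latency=0.0):
    """Linearly warp spike times so that every note has the same duration.

    Assumptions (not from the paper): `note_onsets` is sorted, a note spans
    onset i to onset i + 1, and `latency` is a positive sensory-like lag.
    """
    warped = []
    for t in spike_times:
        t_shifted = t - latency  # undo the assumed response latency
        i = bisect_right(note_onsets, t_shifted) - 1  # note containing the spike
        if i < 0 or i >= len(note_onsets) - 1:
            continue  # spike falls before the first or after the last onset
        start, stop = note_onsets[i], note_onsets[i + 1]
        phase = (t_shifted - start) / (stop - start)  # 0 at onset, 1 at next onset
        warped.append((i + phase) * common_duration)
    return warped


# Hypothetical example: three notes of increasing duration, one spike at the
# midpoint of each note, recorded with a lag of 0.25 time units.
onsets = [0.0, 1.0, 3.0, 6.0]
spikes = [0.75, 2.25, 4.75]
aligned = warp_spikes(spikes, onsets, common_duration=1.0, latency=0.25)
```

Multiplying `phase` by 2π instead of adding the note index would give the note phase used for polar tuning plots; scanning `latency` over a range and keeping the value with the strongest modulation mirrors the best-fit latency idea in panel (e), although the modulation measure itself is defined in the Methods.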
Extended Data Fig. 3 Song-modulated neurons.
(a–c) Spiking raster plots for three example neurons – Cell #19 (a), Cell #5 (b), and Cell #176 (c) – across all trials. At right, a peri-song time histogram (PSTH) for song blocks representing the shortest and longest songs in the session (indicated by cyan and magenta vertical lines on right of raster plots). Black curve represents temporally compressed PSTHs from longest trials as a comparison. The magnitude of compression was chosen to match the ratio of the song durations. (d–f) Spike times of neurons in (a–c) after temporally warping to the beginning and end of song. Green and red lines indicate the onset and offset of songs, respectively.
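The temporal compression used for the black comparison curve can be sketched in a few lines. This is a minimal illustration assuming that time zero is song onset, that the PSTH is sampled at known time points, and that compression is a uniform rescaling of the time axis by the ratio of song durations; the helper names `interp` and `compress_psth` are hypothetical.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Linear interpolation of (xs, ys) at x, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
    x0, x1 = xs[j - 1], xs[j]
    w = (x - x0) / (x1 - x0)
    return ys[j - 1] + w * (ys[j] - ys[j - 1])


def compress_psth(times, rates, long_duration, short_duration, query_times):
    """Compress a long-song PSTH onto the time axis of a short song.

    Assumption (not from the paper): time 0 is song onset, so a time t in the
    short song corresponds to t * (long_duration / short_duration) in the long one.
    """
    ratio = long_duration / short_duration
    return [interp(t * ratio, times, rates) for t in query_times]


# Hypothetical PSTH from long songs (10 s): firing rate ramps up linearly.
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
rates = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
# Evaluate the compressed curve on the time axis of a 5 s song.
compressed = compress_psth(times, rates, 10.0, 5.0, [0.0, 1.0, 2.5, 5.0])
```

If neural activity scales with song duration, the compressed long-song curve should overlay the PSTH measured from the short songs.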
Extended Data Fig. 4 Quantifying neural scaling as a function of behavior across categories.
(a,b) The ratio of the neural scaling factor (Sneural) to the behavioral scaling factor (Sbehavioral) with neurons grouped across different categories, namely reactive versus spontaneous singing (a) and animal ID (b). Each dot represents a comparison of similarly timed trial blocks (n = 4–21) for an individual neuron; quantifications denote median ± MAD. For the analysis shown in (b), Mouse #1 was limited in its total number of song trials per session with our original stringent criterion for significance threshold (p = 0.01), which prevented us from testing our hypothesis. We therefore relaxed this threshold across all animals to p = 0.05, which enabled a direct comparison of scaling factors. In all cases for both (a) and (b), the median neural:behavioral scaling ratio overlapped with 1, which denotes perfect co-variance between the duration of the song and the underlying OMC neural dynamics. See Methods and Fig. 3h for further information concerning how these parameters were calculated. Two-sided tests were used unless specified otherwise.
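The median ± MAD summary of the neural:behavioral scaling ratio can be sketched as below. How Sneural and Sbehavioral are estimated is described in the Methods and is not reproduced here; this sketch only assumes a list of per-neuron ratios is already available, and it interprets "overlapped with 1" as 1 lying within median ± MAD, which is our assumption. The function names are hypothetical.

```python
from statistics import median


def median_and_mad(values):
    """Return the median and the median absolute deviation (MAD) of `values`."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad


def overlaps_one(ratios):
    """Assumed criterion: 1 lies inside the interval median ± MAD."""
    med, mad = median_and_mad(ratios)
    return med - mad <= 1.0 <= med + mad


# Hypothetical Sneural / Sbehavioral ratios for five neurons in one category.
ratios = [0.5, 1.0, 1.5, 2.0, 0.75]
med, mad = median_and_mad(ratios)
```

A ratio near 1 means that a given fractional change in song duration is matched by the same fractional stretch or compression of the neuron's firing pattern.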
Extended Data Fig. 5 Cross-validation of hierarchical clustering.
(a) Shown are the results of hierarchical clustering performed on the training (left) and test (right) set of trials sorted with respect to cluster affiliation (left). (b) Cross-validated firing rate profiles of the eight clusters evaluated on the training (solid) and the test (dashed) data set. (c) The ratio of the neural scaling factor (Sneural) to the behavioral scaling factor (Sbehavioral) with neurons grouped across different categories. Each dot represents a comparison of similarly timed trial blocks (n = 4–21) for an individual neuron; quantifications denote median ± MAD. In all cases, the median neural:behavioral scaling ratio overlapped with 1, which denotes perfect co-variance between the duration of the song and the underlying OMC neural dynamics. See Methods and Fig. 3h for further information concerning how these parameters were calculated.
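The idea behind comparing training and test profiles in panel (b) can be illustrated with a simple split-half check. This sketch does not perform the hierarchical clustering itself and is not the authors' procedure: it assumes time-warped, binned firing rates per trial, assigns alternating trials to a training and a test set, and correlates the two trial-averaged profiles. All function names are hypothetical.

```python
from math import sqrt


def mean_profile(trials):
    """Average equally binned firing rate profiles across trials."""
    n = len(trials)
    return [sum(col) / n for col in zip(*trials)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def split_half_consistency(trials):
    """Correlate the mean profile of even-indexed trials with odd-indexed ones.

    Assumption (not from the paper): alternating trials form the training
    and test sets; the Methods define the split actually used.
    """
    train = trials[0::2]
    test = trials[1::2]
    return pearson(mean_profile(train), mean_profile(test))


# Hypothetical neuron whose ramping profile repeats across four trials
# with a trial-to-trial offset in baseline rate.
trials = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
]
r = split_half_consistency(trials)
```

A correlation close to 1 indicates that the profile estimated from the training trials generalizes to held-out trials, which is the property the solid and dashed curves in panel (b) are meant to convey.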
Extended Data Fig. 6 Details of the computational model.
(a) Inferred weights (shown at left) for each song-modulated OMC neuron (shown in middle), which together lead to a descending synaptic drive (shown at right) to the downstream note pattern generator. (b) An alternative implementation of the hierarchical model, in which the note pattern generator produces a song by combining an unscaled step-like input with a characteristic time-dependent adaptation. These inputs could be intrinsic to the pattern generator or could be inherited from a different brain area. In both cases, time-scaled OMC activity can interface with the existing note-generating mechanism to produce adaptive behavioral variability. (c) In the absence of the OMC input, the note pattern generator can produce notes but loses flexibility, resulting in songs with higher stereotypy, consistent with a partially autonomous motor control system.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Banerjee, A., Chen, F., Druckmann, S. et al. Temporal scaling of motor cortical dynamics reveals hierarchical control of vocal production. Nat Neurosci 27, 527–535 (2024). https://doi.org/10.1038/s41593-023-01556-5
DOI: https://doi.org/10.1038/s41593-023-01556-5