Temporal scaling of motor cortical dynamics reveals hierarchical control of vocal production

Abstract

Neocortical activity is thought to mediate voluntary control over vocal production, but the underlying neural mechanisms remain unclear. In a highly vocal rodent, the male Alston’s singing mouse, we investigate neural dynamics in the orofacial motor cortex (OMC), a structure critical for vocal behavior. We first describe neural activity that is modulated by component notes (~100 ms), probably representing sensory feedback. At longer timescales, however, OMC neurons exhibit diverse and often persistent premotor firing patterns that stretch or compress with song duration (~10 s). Using computational modeling, we demonstrate that such temporal scaling, acting through downstream motor production circuits, can enable vocal flexibility. These results provide a framework for studying hierarchical control circuits, a common design principle across many natural and artificial systems.

Fig. 1: Reliable cortical population activity during singing in S. teguina.
Fig. 2: Note-related activity of OMC neurons.
Fig. 3: Scaling of neural activity with song duration.
Fig. 4: Diverse categories of OMC firing patterns during singing.
Fig. 5: Hierarchical model of vocal motor control.

Data availability

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, M.A.L. (mlong@med.nyu.edu). This study did not generate new unique reagents. The datasets generated during this study are available upon request from the lead contact. Source code and documentation required for running all analyses are available.

Code availability

Analysis code is available in a GitHub repository (https://github.com/ccffccffcc/NNSingingMouse).

Acknowledgements

We thank S. Shea, F. Albeanu, W. Bast, J. del Rosario, H. Sloin and members of the Long and Banerjee laboratories for comments on earlier versions of the manuscript. A. Paulson provided technical assistance. Funding was provided by the National Institutes of Health grant R01 NS113071 (M.A.L., S.D.), Simons Collaboration on the Global Brain (M.A.L., S.D.), Searle Scholars Program (A.B.), Klingenstein–Simons fellowship (A.B.) and the Simons Foundation Junior Fellows Program (A.B.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Contributions

A.B. and M.A.L. conceived the project; A.B., F.C., S.D. and M.A.L. designed the methodology. A.B. and M.A.L. performed the investigation. A.B., F.C., S.D. and M.A.L. visualized the project. A.B., S.D. and M.A.L. acquired funding. S.D. and M.A.L. administered and supervised the project. A.B. and M.A.L. wrote the original draft of the manuscript; A.B., F.C., S.D. and M.A.L. contributed to writing, review and editing.

Corresponding authors

Correspondence to Arkarup Banerjee or Michael A. Long.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Neuroscience thanks Steffen Hage, Mehrdad Jazayeri and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Determination of significant note- and song-related responses.

(a,b) Example neurons with (a, Cell #19) and without (b, Cell #1) significant note modulation. Spike rasters (top) and spike probability density plots (bottom) for example neurons whose activity profiles have been linearly warped to a common note duration (onsets indicated by green lines). Each row represents the warped spike raster of a neuron aligned to the beginning of a sequence of three notes; responses are sorted based on the original duration of the first note produced in this sequence from longest (top) to shortest (bottom). At right, polar plots describe the tuning of spike times with respect to the relative phase of note production. Dashed lines indicate a uniform distribution. (c) Histogram of note modulation (see Methods) for significantly note-modulated neurons (n = 111) compared with the same analysis applied to nonsinging epochs. (d) Song modulation analysis protocol. Neural activity for songs (black rectangles) is aligned either to their starts (top) or stops (bottom). The evaluation window (song epoch) begins and ends two seconds before and after the shortest song duration of that session. (e) The relative firing rate difference between the song-aligned spiking activity and a nonsinging period for a modulated (left, Cell #169) and unmodulated (right, Cell #187) neuron. 72 song trials are represented by separate lines for each neuron. Significance determined by bootstrap resampling (***: p < 0.01 two-sided test, n.s.: not significant). (f) Histogram of song modulation values (see Methods) for all song-modulated neurons (n = 133) and those not modulated by song (n = 242).
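The linear warping and phase tuning described in panels (a,b) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names `warp_to_phase` and `vector_strength` are hypothetical, and the mean resultant vector length is used here only as an assumed stand-in for a tuning summary (the note modulation metric itself is defined in Methods).

```python
import math

def warp_to_phase(spike_times, note_onsets):
    """Linearly warp each spike to (note index, phase in [0, 1)).

    Phase is the spike's fractional position between two consecutive
    note onsets, so notes of different durations share a common axis.
    Spikes outside the onset range are dropped.
    """
    warped = []
    for t in spike_times:
        for k in range(len(note_onsets) - 1):
            start, stop = note_onsets[k], note_onsets[k + 1]
            if start <= t < stop:
                warped.append((k, (t - start) / (stop - start)))
                break
    return warped

def vector_strength(phases):
    """Mean resultant vector length (assumed tuning summary, not the paper's metric).

    Values near 0 correspond to a uniform phase distribution;
    values near 1 correspond to spikes locked to a single phase.
    """
    if not phases:
        return 0.0
    c = sum(math.cos(2 * math.pi * p) for p in phases) / len(phases)
    s = sum(math.sin(2 * math.pi * p) for p in phases) / len(phases)
    return math.hypot(c, s)

# Illustrative data only: three notes of unequal duration, one spike at each midpoint.
onsets = [0.0, 0.1, 0.25, 0.45]
spikes = [0.05, 0.175, 0.35]
phases = [p for _, p in warp_to_phase(spikes, onsets)]
strength = vector_strength(phases)
```

In this toy case every spike lands at the same relative phase even though the notes differ in duration, which is the situation a polar plot would show as a sharply tuned distribution.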

Extended Data Fig. 2 Further characterization of note-related responses.

(a,b) Spike times of two example neurons, Cell #5 (a) and Cell #19 (b), linearly warped to a common note duration (onsets indicated by dashed lines). Each row represents the warped spike raster of a neuron aligned to the beginning of a sequence of three notes; responses are sorted based on the original duration of the first note produced in this sequence from longest (top) to shortest (bottom). Examples in (a) and (b) relate to analyses in Fig. 2e. (c) Spiking activity corresponding to note timing for an example neuron (Cell #180 from Mouse #4). For visualization, analysis was restricted to notes of prespecified durations (top: 55 to 60 ms; bottom: 150 to 200 ms, sample note sonograms provided for each range). For long note durations, robust spiking emerges near the end of each note. Green and red ticks indicate the onset and offset of notes, respectively. (d) Spiking activity from Cell #180 linearly warped to a common note duration (onsets indicated by dashed lines). Timing shifted by a best fit latency of 110 ms (sensory-like shift). (e) Summary plot (extension from Fig. 2f) showing the latency resulting in the maximum note modulation strength for all note-modulated neurons (n = 111). Gray symbols represent cases that are not significantly different from zero, and red (n = 23) and blue (n = 2) symbols represent points with sensory and motor offsets, respectively.
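The idea of a best-fit latency in panels (d,e) can be sketched as a grid search: shift the spike train by each candidate latency, warp to note phase, and keep the latency that gives the strongest phase locking. The sketch below rests on assumptions: `best_latency` and its helpers are hypothetical names, vector strength stands in for the note modulation strength defined in Methods, and the sign convention (positive latency means spikes lag the note, that is, sensory-like) is chosen here for illustration.

```python
import math

def note_phases(spike_times, note_onsets):
    """Phase in [0, 1) of each spike within its enclosing note (linear warp)."""
    phases = []
    for t in spike_times:
        for start, stop in zip(note_onsets, note_onsets[1:]):
            if start <= t < stop:
                phases.append((t - start) / (stop - start))
                break
    return phases

def locking_strength(phases):
    """Mean resultant vector length; an assumed proxy for modulation strength."""
    if not phases:
        return 0.0
    c = sum(math.cos(2 * math.pi * p) for p in phases) / len(phases)
    s = sum(math.sin(2 * math.pi * p) for p in phases) / len(phases)
    return math.hypot(c, s)

def best_latency(spike_times, note_onsets, candidates, min_spikes=3):
    """Return (latency, strength) maximizing phase locking after shifting spikes.

    A positive latency moves spikes earlier in time, undoing a sensory-like lag.
    Candidates that leave fewer than `min_spikes` spikes inside notes are skipped,
    because a near-empty sample would look perfectly locked by chance.
    """
    best, best_strength = None, -1.0
    for latency in candidates:
        shifted = [t - latency for t in spike_times]
        phases = note_phases(shifted, note_onsets)
        if len(phases) < min_spikes:
            continue
        strength = locking_strength(phases)
        if strength > best_strength:
            best, best_strength = latency, strength
    return best, best_strength

# Illustrative data only: notes lengthen over time; each spike follows the
# midpoint of a note by a fixed 110 ms.
durations = [0.06, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.20]
onsets = [0.0]
for d in durations:
    onsets.append(onsets[-1] + d)
spikes = [(a + b) / 2 + 0.11 for a, b in zip(onsets, onsets[1:])]
candidates = [round(0.01 * k, 2) for k in range(-20, 21)]
latency, strength = best_latency(spikes, onsets, candidates)
```

Note that this search is only informative when note durations vary: with identical notes, a constant time shift merely rotates all phases together and leaves the locking strength unchanged.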

Extended Data Fig. 3 Song-modulated neurons.

(a-c) Spiking raster plots for three example neurons – Cell #19 (a), Cell #5 (b), and Cell #176 (c) – across all trials. At right, a peri-song time histogram (PSTH) for song blocks representing the shortest and longest songs in the session (indicated by cyan and magenta vertical lines on right of raster plots). Black curve represents temporally compressed PSTHs from longest trials as a comparison. The magnitude of compression was chosen to match the ratio of the song durations. (d-f) Spike times of neurons in (a-c) after temporally warping to the beginning and end of song. Green and red lines indicate the onset and offset of songs, respectively.
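The temporal compression used for the black comparison curve can be illustrated with a short sketch: the PSTH from the longest songs is resampled so that its time axis shrinks by the ratio of the song durations. The function name `rescale_psth`, the fixed bin width, and the use of linear interpolation are assumptions for illustration, not details taken from the paper.

```python
import math

def rescale_psth(rates, scale):
    """Resample a PSTH so its time axis is multiplied by `scale`.

    `rates` is a list of firing rates in equal-width bins. With the same
    bin width on output, scale < 1 compresses and scale > 1 stretches.
    Values between bins are obtained by linear interpolation.
    """
    n_in = len(rates)
    n_out = max(2, round(n_in * scale))
    out = []
    for j in range(n_out):
        # Position of output bin j on the original (input) bin axis.
        x = j * (n_in - 1) / (n_out - 1)
        lo = int(math.floor(x))
        hi = min(lo + 1, n_in - 1)
        frac = x - lo
        out.append(rates[lo] * (1 - frac) + rates[hi] * frac)
    return out

# Illustrative data only: a ramping PSTH from 10 s songs compressed to 5 s.
long_psth = [float(i) for i in range(10)]
short_duration, long_duration = 5.0, 10.0
compressed = rescale_psth(long_psth, short_duration / long_duration)
```

If the neural dynamics scale with song duration, the compressed long-song PSTH should overlay the PSTH measured from the shortest songs.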

Extended Data Fig. 4 Quantifying neural scaling as a function of behavior across categories.

(a,b) The ratio of the neural scaling factor (Sneural) to the behavioral scaling factor (Sbehavioral) with neurons grouped across different categories, namely reactive versus spontaneous singing (a) and animal ID (b). Each dot represents a comparison of similarly timed trial blocks (n = 4–21) for an individual neuron; quantifications denote median ± MAD. For the analysis shown in (b), Mouse #1 was limited in its total number of song trials per session with our original stringent criterion for significance threshold (p = 0.01), which prevented us from testing our hypothesis. We therefore relaxed this threshold across all animals to p = 0.05, which enabled a direct comparison of scaling factors. In all cases for both (a) and (b), the median neural:behavioral scaling ratio overlapped with 1, which denotes perfect covariance between the duration of the song and the underlying OMC neural dynamics. See Methods and Fig. 3h for further information concerning how these parameters were calculated. Two-sided tests were used unless specified otherwise.
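A rough sketch of how a neural:behavioral scaling ratio and its median ± MAD summary might be computed is given below. The paper's actual definitions are in Methods and Fig. 3h; here both scaling factors are assumed to be test-to-reference ratios, Sneural is estimated by a simple grid search over time-stretch factors, and all function names are hypothetical.

```python
import statistics

def rate_at(rates, dt, t):
    """Linearly interpolated firing rate at time t for a PSTH with bin width dt."""
    x = t / dt
    if x <= 0:
        return rates[0]
    if x >= len(rates) - 1:
        return rates[-1]
    lo = int(x)
    frac = x - lo
    return rates[lo] * (1 - frac) + rates[lo + 1] * frac

def neural_scaling_factor(ref, test, dt, candidates):
    """Stretch factor s for which ref(t / s) best matches test(t) (least squares)."""
    best_s, best_err = None, float("inf")
    for s in candidates:
        err = sum(
            (test[i] - rate_at(ref, dt, i * dt / s)) ** 2 for i in range(len(test))
        ) / len(test)
        if err < best_err:
            best_s, best_err = s, err
    return best_s

def median_mad(values):
    """Median and median absolute deviation (MAD) of a list of numbers."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return med, mad

# Illustrative data only: a ramp over 10 s songs and the same ramp over 5 s songs.
dt = 0.1
ref = [i * dt for i in range(101)]        # reference block, 10 s
test = [2 * i * dt for i in range(51)]    # test block, 5 s
candidates = [0.3 + 0.05 * k for k in range(25)]
s_neural = neural_scaling_factor(ref, test, dt, candidates)
s_behavioral = 5.0 / 10.0
ratio = s_neural / s_behavioral
```

A ratio near 1 for a given neuron and block pair indicates that the firing pattern stretched by about the same factor as the song; `median_mad` would then summarize such ratios across neurons within a category.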

Extended Data Fig. 5 Cross-validation of hierarchical clustering.

(a) Shown are the results of hierarchical clustering performed on the training (left) and test (right) set of trials sorted with respect to cluster affiliation (left). (b) Cross-validated firing rate profiles of the eight clusters evaluated on the training (solid) and the test (dashed) data set. (c) The ratio of the neural scaling factor (Sneural) to the behavioral scaling factor (Sbehavioral) with neurons grouped across different categories. Each dot represents a comparison of similarly timed trial blocks (n = 4–21) for an individual neuron; quantifications denote median ± MAD. In all cases, the median neural:behavioral scaling ratio overlapped with 1, which denotes perfect covariance between the duration of the song and the underlying OMC neural dynamics. See Methods and Fig. 3h for further information concerning how these parameters were calculated.
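The cross-validation logic in panels (a,b) can be sketched as follows: cluster neurons using firing-rate profiles from training trials only, then evaluate each cluster's average profile on held-out test trials. The sketch uses a naive average-linkage agglomerative procedure with Euclidean distance; the linkage rule, distance measure, and function names are assumptions for illustration and may differ from the authors' choices.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(profiles, n_clusters):
    """Naive average-linkage hierarchical clustering; returns lists of neuron indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = sum(
                    euclid(profiles[i], profiles[j])
                    for i in clusters[a]
                    for j in clusters[b]
                ) / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

def mean_profile(profiles, members):
    """Average firing-rate profile over the given neuron indices."""
    n_bins = len(profiles[0])
    return [sum(profiles[i][k] for i in members) / len(members) for k in range(n_bins)]

# Illustrative data only: per-neuron profiles averaged over training and test trials.
train = [[0.0, 1.0, 2.0, 3.0], [0.1, 1.1, 1.9, 3.2],
         [3.0, 2.0, 1.0, 0.0], [3.1, 1.9, 1.2, 0.1]]
test = [[0.2, 0.9, 2.1, 2.9], [0.0, 1.2, 2.0, 3.1],
        [2.9, 2.1, 0.9, 0.1], [3.0, 2.0, 1.1, 0.0]]
clusters = agglomerate(train, n_clusters=2)               # fit on training trials only
test_profiles = [mean_profile(test, c) for c in clusters]  # evaluate on held-out trials
```

If cluster identity reflects a stable property of each neuron, the held-out profiles should resemble the training profiles of the same cluster.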

Extended Data Fig. 6 Details of the computational model.

(a) Inferred weights (shown at left) for each song-modulated OMC neuron (shown in middle), which together lead to a descending synaptic drive (shown at right) to the downstream note pattern generator. (b) An alternative implementation of the hierarchical model, in which the note pattern generator produces a song by combining an unscaled step-like input with a characteristic time-dependent adaptation. These inputs could be intrinsic to the pattern generator or could be inherited from a different brain area. In both cases, time-scaled OMC activity can interface with the existing note-generating mechanism to produce adaptive behavioral variability. (c) In the absence of the OMC input, the note pattern generator can produce notes but loses flexibility, resulting in songs with higher stereotypy, consistent with a partially autonomous motor control system.
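The general structure in panel (a), a weighted sum of time-scaled OMC activity driving a downstream note pattern generator, can be caricatured in a few lines. This toy is not the authors' model: the two rate profiles, the weights, the integrate-to-threshold note generator, and all parameter values are hypothetical choices made only to show how stretching the cortical drive can stretch the song.

```python
def omc_drive(phase, weights):
    """Weighted sum of two hypothetical OMC rate profiles defined on song phase in [0, 1]."""
    ramp_up = phase          # hypothetical neuron whose rate rises through the song
    ramp_down = 1.0 - phase  # hypothetical neuron whose rate falls through the song
    return weights[0] * ramp_up + weights[1] * ramp_down

def simulate_song(duration, weights=(1.0, 0.0), dt=0.001,
                  fast_rate=12.0, slow_rate=4.0):
    """Return note onset times (s) for a song lasting `duration` seconds.

    Temporal scaling: the drive is always read out at phase t / duration, so the
    same profile is stretched or compressed to fit the song. The note generator
    accumulates a cycle variable at a rate (notes per second) set by the drive
    and emits a note each time the variable crosses 1.
    """
    onsets = [0.0]
    cycle = 0.0
    n_steps = int(round(duration / dt))
    for k in range(n_steps):
        drive = omc_drive(k * dt / duration, weights)
        rate = fast_rate + (slow_rate - fast_rate) * drive  # slows as drive grows
        cycle += rate * dt
        if cycle >= 1.0:
            cycle -= 1.0
            onsets.append((k + 1) * dt)
    return onsets

# Illustrative runs only: the same drive profile stretched over 5 s and 10 s.
short_song = simulate_song(5.0)
long_song = simulate_song(10.0)
```

In this caricature, doubling the duration of the drive roughly doubles the number of notes while preserving the progression from short to long notes, which is the qualitative behavior that temporal scaling is meant to capture. The alternative implementation in panel (b) and the loss of flexibility without OMC input in panel (c) are not modeled here.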

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Banerjee, A., Chen, F., Druckmann, S. et al. Temporal scaling of motor cortical dynamics reveals hierarchical control of vocal production. Nat Neurosci 27, 527–535 (2024). https://doi.org/10.1038/s41593-023-01556-5

  • DOI: https://doi.org/10.1038/s41593-023-01556-5
