Neural sequences are a fundamental feature of brain dynamics underlying diverse behaviours, but the mechanisms by which they develop during learning remain unknown. Songbirds learn vocalizations composed of syllables; in adult birds, each syllable is produced by a different sequence of action potential bursts in the premotor cortical area HVC. Here we carried out recordings of large populations of HVC neurons in singing juvenile birds throughout learning to examine the emergence of neural sequences. Early in vocal development, HVC neurons begin producing rhythmic bursts, temporally locked to a ‘prototype’ syllable. Different neurons are active at different latencies relative to syllable onset to form a continuous sequence. Through development, as new syllables emerge from the prototype syllable, initially highly overlapping burst sequences become increasingly distinct. We propose a mechanistic model in which multiple neural sequences can emerge from the growth and splitting of a common precursor sequence.
Wikenheiser, A. M. & Redish, A. D. Hippocampal theta sequences reflect current goals. Nature Neurosci. 18, 289–294 (2015)
Pfeiffer, B. E. & Foster, D. J. Hippocampal place-cell sequences depict future paths to remembered goals. Nature 497, 74–79 (2013)
Dragoi, G. & Tonegawa, S. Preplay of future place cell sequences by hippocampal cellular assemblies. Nature 469, 397–401 (2011)
Davidson, T. J., Kloosterman, F. & Wilson, M. A. Hippocampal replay of extended experience. Neuron 63, 497–507 (2009)
Fujisawa, S., Amarasingham, A., Harrison, M. T. & Buzsáki, G. Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. Nature Neurosci. 11, 823–833 (2008)
Pastalkova, E., Itskov, V., Amarasingham, A. & Buzsáki, G. Internally generated cell assembly sequences in the rat hippocampus. Science 321, 1322–1327 (2008)
Eichenbaum, H. Time cells in the hippocampus: a new dimension for mapping memories. Nature Rev. Neurosci. 15, 732–744 (2014)
Harvey, C. D., Coen, P. & Tank, D. W. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature 484, 62–68 (2012)
Murakami, M., Vicente, M. I., Costa, G. M. & Mainen, Z. F. Neural antecedents of self-initiated actions in secondary motor cortex. Nature Neurosci. 17, 1574–1582 (2014)
Peters, A. J., Chen, S. X. & Komiyama, T. Emergence of reproducible spatiotemporal activity during motor learning. Nature 510, 263–267 (2014)
Tanji, J. Sequential organization of multiple movements: involvement of cortical motor areas. Annu. Rev. Neurosci. 24, 631–651 (2001)
Buzsáki, G. Neural syntax: cell assemblies, synapsembles, and readers. Neuron 68, 362–385 (2010)
Vogels, T. P., Rajan, K. & Abbott, L. F. Neural network dynamics. Annu. Rev. Neurosci. 28, 357–376 (2005)
Immelmann, K. in Bird Vocalizations (ed. Hinde, R. A. ) 61–74 (Cambridge Univ. Press, 1969)
Doupe, A. J. & Kuhl, P. K. Birdsong and human speech: common themes and mechanisms. Annu. Rev. Neurosci. 22, 567–631 (1999)
Mooney, R. Neural mechanisms for learned birdsong. Learn. Mem. 16, 655–669 (2009)
Konishi, M. Birdsong: from behavior to neuron. Annu. Rev. Neurosci. 8, 125–170 (1985)
Brainard, M. S. & Doupe, A. J. Translating birdsong: songbirds as a model for basic and applied medical research. Annu. Rev. Neurosci. 36, 489–517 (2013)
Hahnloser, R. H., Kozhevnikov, A. A. & Fee, M. S. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419, 65–70 (2002)
Kozhevnikov, A. A. & Fee, M. S. Singing-related activity of identified HVC neurons in the zebra finch. J. Neurophysiol. 97, 4271–4283 (2007)
Long, M. A., Jin, D. Z. & Fee, M. S. Support for a synaptic chain model of neuronal sequence generation. Nature 468, 394–399 (2010)
Amador, A., Perl, Y. S., Mindlin, G. B. & Margoliash, D. Elemental gesture dynamics are encoded by song premotor cortical neurons. Nature 495, 59–64 (2013)
Fujimoto, H., Hasegawa, T. & Watanabe, D. Neural coding of syntactic structure in learned vocalizations in the songbird. J. Neurosci. 31, 10023–10033 (2011)
Prather, J. F., Peters, S., Nowicki, S. & Mooney, R. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature 451, 305–310 (2008)
Nottebohm, F., Stokes, T. M. & Leonard, C. M. Central control of song in the canary, Serinus canarius. J. Comp. Neurol. 165, 457–486 (1976)
Long, M. A. & Fee, M. S. Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature 456, 189–194 (2008)
Aronov, D., Andalman, A. S. & Fee, M. S. A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320, 630–634 (2008)
Simpson, H. B. & Vicario, D. S. Brain pathways for learned and unlearned vocalizations differ in zebra finches. J. Neurosci. 10, 1541–1556 (1990)
Ali, F. et al. The basal ganglia is necessary for learning spectral, but not temporal, features of birdsong. Neuron 80, 494–506 (2013)
Vallentin, D. & Long, M. A. Motor origin of precise synaptic inputs onto forebrain neurons driving a skilled behavior. J. Neurosci. 35, 299–307 (2015)
Zann, R. A. The Zebra Finch: A Synthesis of Field and Laboratory Studies (Oxford Univ. Press, 1996)
Liu, W. C., Gardner, T. J. & Nottebohm, F. Juvenile zebra finches can use multiple strategies to learn the same song. Proc. Natl Acad. Sci. USA 101, 18177–18182 (2004)
Tchernichovski, O., Mitra, P. P., Lints, T. & Nottebohm, F. Dynamics of the vocal imitation process: how a zebra finch learns its song. Science 291, 2564–2569 (2001)
Aronov, D., Veit, L., Goldberg, J. H. & Fee, M. S. Two distinct modes of forebrain circuit dynamics underlie temporal patterning in the vocalizations of young songbirds. J. Neurosci. 31, 16353–16368 (2011)
Veit, L., Aronov, D. & Fee, M. S. Learning to breathe and sing: development of respiratory-vocal coordination in young songbirds. J. Neurophysiol. 106, 1747–1765 (2011)
Tchernichovski, O. & Mitra, P. P. Towards quantification of vocal imitation in the zebra finch. J. Comp. Physiol. A 188, 867–878 (2002)
Glaze, C. M. & Troyer, T. W. Development of temporal structure in zebra finch song. J. Neurophysiol. 109, 1025–1035 (2013)
Saar, S. & Mitra, P. P. A technique for characterizing the development of rhythms in bird song. PLoS One 3, e1461 (2008)
Lipkind, D. et al. Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants. Nature 498, 104–108 (2013)
Lipkind, D. & Tchernichovski, O. Quantification of developmental birdsong learning from the subsyllabic scale to cultural evolution. Proc. Natl Acad. Sci. USA 108 (Suppl. 3), 15572–15579 (2011)
Jin, D. Z., Ramazanoğlu, F. M. & Seung, H. S. Intrinsic bursting enhances the robustness of a neural network model of sequence generation by avian brain area HVC. J. Comput. Neurosci. 23, 283–299 (2007)
Li, M. & Greenside, H. Stable propagation of a burst through a one-dimensional homogeneous excitatory chain model of songbird nucleus HVC. Phys. Rev. E 74, 011918 (2006)
Jun, J. K. & Jin, D. Z. Development of neural circuitry for precise temporal sequences through spontaneous activity, axon remodeling, and synaptic plasticity. PLoS One 2, e723 (2007)
Fiete, I. R., Senn, W., Wang, C. Z. & Hahnloser, R. H. Spike-time-dependent plasticity and heterosynaptic competition organize networks to produce long scale-free sequences of neural activity. Neuron 65, 563–576 (2010)
Buonomano, D. V. A learning rule for the emergence of stable dynamics and timing in recurrent networks. J. Neurophysiol. 94, 2275–2283 (2005)
Gibb, L., Gentner, T. Q. & Abarbanel, H. D. Inhibition and recurrent excitation in a computational model of sparse bursting in song nucleus HVC. J. Neurophysiol. 102, 1748–1762 (2009)
Bertram, R., Daou, A., Hyson, R. L., Johnson, F. & Wu, W. Two neural streams, one voice: pathways for theme and variation in the songbird brain. Neuroscience 277, 806–817 (2014)
Kosche, G., Vallentin, D. & Long, M. A. Interplay of inhibition and excitation shapes a premotor neural sequence. J. Neurosci. 35, 1217–1227 (2015)
Goller, F. & Cooper, B. G. Peripheral motor dynamics of song production in the zebra finch. Ann. NY Acad. Sci. 1016, 130–152 (2004)
Ohno, S. Evolution by Gene Duplication (Springer-Verlag, 1970)
Tchernichovski, O., Nottebohm, F., Ho, C. E., Pesaran, B. & Mitra, P. P. A procedure for an automated measurement of song similarity. Anim. Behav. 59, 1167–1176 (2000)
Tchernichovski, O., Lints, T. J., Deregnaucourt, S., Cimenser, A. & Mitra, P. P. Studying the song development process: rationale and methods. Ann. NY Acad. Sci. 1016, 348–363 (2004)
Goller, F. & Daley, M. A. Novel motor gestures for phonation during inspiration enhance the acoustic complexity of birdsong. Proc. R. Soc. Lond. B 268, 2301–2305 (2001)
Rajan, R. & Doupe, A. J. Behavioral and neural signatures of readiness to initiate a learned motor sequence. Curr. Biol. 23, 87–93 (2013)
Mandelblat-Cerf, Y. & Fee, M. S. An automated procedure for evaluating song imitation. PLoS One 9, e96484 (2014)
Fee, M. S. & Leonardo, A. Miniature motorized microdrive and commutator system for chronic neural recording in small animals. J. Neurosci. Methods 112, 83–94 (2001)
Okubo, T. S., Mackevicius, E. L. & Fee, M. S. In vivo recording of single-unit activity during singing in zebra finches. Cold Spring Harb. Protoc. 2014, 1273–1283 (2014)
Fee, M. S., Kozhevnikov, A. A. & Hahnloser, R. H. Neural mechanisms of vocal sequence generation in the songbird. Ann. NY Acad. Sci. 1016, 153–170 (2004)
Hahnloser, R. H., Kozhevnikov, A. A. & Fee, M. S. Sleep-related neural activity in a premotor and a basal-ganglia pathway of the songbird. J. Neurophysiol. 96, 794–812 (2006)
Goldberg, J. H. & Fee, M. S. A cortical motor nucleus drives the basal ganglia-recipient thalamus in singing birds. Nature Neurosci. 15, 620–627 (2012)
Rieke, F. Spikes: Exploring the Neural Code (MIT Press, 1997)
Jarvis, M. R. & Mitra, P. P. Sampling properties of the spectrum and coherency of sequences of action potentials. Neural Comput. 13, 717–749 (2001)
Bokil, H., Andrews, P., Kulkarni, J. E., Mehta, S. & Mitra, P. P. Chronux: a platform for analyzing neural signals. J. Neurosci. Methods 192, 146–151 (2010)
Mitra, P. & Bokil, H. Observed Brain Dynamics (Oxford Univ. Press, 2008)
Oppenheim, A. V. & Schafer, R. W. From frequency to quefrency: a history of the Cepstrum. IEEE Signal Process. Mag. 21, 95–106 (2004)
Garst-Orozco, J., Babadi, B. & Ölveczky, B. P. A neural circuit mechanism for regulating vocal variability during song learning in zebra finches. eLife 3, e03697 (2014)
Leonardo, A. & Fee, M. S. Ensemble coding of vocal control in birdsong. J. Neurosci. 25, 652–661 (2005)
Ashmore, R. C., Wild, J. M. & Schmidt, M. F. Brainstem and forebrain contributions to the generation of learned motor behaviors for song. J. Neurosci. 25, 8543–8554 (2005)
Lim, Y., Shinn-Cunningham, B. & Gardner, T. J. Sparse contour representations of sound. IEEE Signal Process. Lett. 19, 684–687 (2012)
Markowitz, J. E., Ivie, E., Kligler, L. & Gardner, T. J. Long-range order in canary song. PLOS Comput. Biol. 9, e1003052 (2013)
Duda, R. O., Hart, P. E. & Stork, D. G. Pattern Classification 2nd edn (Wiley, 2001)
Kanji, G. K. 100 Statistical Tests 3rd edn (Sage Publications, 2006)
McDonald, J. H. Handbook of Biological Statistics 3rd edn (Sparky House Publishing, 2014)
Abbott, L. F. & Blum, K. I. Functional significance of long-term potentiation for sequence learning and prediction. Cereb. Cortex 6, 406–416 (1996)
Dan, Y. & Poo, M. M. Spike timing-dependent plasticity: from synapse to perception. Physiol. Rev. 86, 1033–1048 (2006)
Fee, M. S. & Goldberg, J. H. A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience 198, 152–170 (2011)
Fiete, I. R., Hahnloser, R. H., Fee, M. S. & Seung, H. S. Temporal sparseness of the premotor drive is important for rapid learning in a neural network model of birdsong. J. Neurophysiol. 92, 2274–2282 (2004)
Charlesworth, J. D., Tumer, E. C., Warren, T. L. & Brainard, M. S. Learning the microstructure of successful behavior. Nature Neurosci. 14, 373–380 (2011)
Ravbar, P., Lipkind, D., Parra, L. C. & Tchernichovski, O. Vocal exploration is locally regulated during song learning. J. Neurosci. 32, 3422–3432 (2012)
Walton, C., Pariser, E. & Nottebohm, F. The zebra finch paradox: song is little changed, but number of neurons doubles. J. Neurosci. 32, 761–774 (2012)
We thank M. Wilson, J. Kornfeld, M. Jazayeri, S. Seung, N. Ji, and M. Stetner for comments on the manuscript. Funding to M.S.F. was provided by the NIH (grant no. R01DC009183) and by the Mathers Foundation, to T.S.O. by the Nakajima Foundation and Schoemaker Fellowship, to E.L.M. by the Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program, and to H.L.P. by the National Science Foundation (NSF) Graduate Research Fellowship Program (no. DGE-114747) and the NSF Integrative Graduate Education and Research Traineeship (no. 0801700). The modelling work was begun in the Methods in Computational Neuroscience course at the Marine Biological Laboratory (NIH grant number R25MH062204).
The authors declare no competing financial interests.
Extended data figures and tables
Extended Data Figure 1 Bursting and syllable-locked activity in HVC projection neurons of juvenile birds.
a, Range of bird ages at which songs were classified at different developmental stages (Spearman’s rank correlation between age and stage ρ = 0.61; red line indicates the median, box indicates the 25–75 percentile, and whiskers indicate 10–90 percentile; n = 12, 13, 18 and 6 birds, respectively; n = 39, 135, 565 and 378 neurons, respectively). b, Interspike-interval (ISI) distributions (mean ± s.e.m.) of HVC projection neurons that exhibited spiking during singing, at three stages of vocal development (n = 38, 130, 922 neurons). ISI distributions computed with logarithmic binning show bimodal structure: the peak around 3–5 ms indicates inter-spike intervals within bursts, and a broader peak around 100–400 ms indicates intervals between bursts (dashed line indicates the 30 ms threshold used for defining a burst; dotted line indicates peak). Note the refractory period below 1 ms. c, Burst width (top) and firing rate during bursts (bottom) as a function of developmental stage (median ± quartiles; n = 39, 135, 565, 378 and 32 neurons, respectively; **P < 0.01, ***P < 0.001 post-hoc comparison with adult stage). d–i, Syllable-onset-aligned raster plots and histograms for neurons recorded during the subsong stage. Syllables are sorted from bottom to top by increasing syllable duration (blue lines indicate syllable offset). d, Neuron that did not exhibit significant locking to subsong syllable onsets (RA-projecting neuron, HVCRA; 50 dph; bird 7). e, Another neuron in the same bird (same neuron as in Fig. 1a; HVCRA; 51 dph). f, g, Two projection neurons recorded in a different subsong bird (both X-projecting neurons, HVCX; 47 and 48 dph, respectively; bird 9). Note different latencies of bursting. h, i, Two projection neurons recorded in a different subsong bird (both HVCX; 47 and 44 dph, respectively; bird 10). j, k, Syllable-onset-aligned raster plots and histograms showing strong locking to protosyllables (bird 2). j, For the same neuron as in Fig. 1b (HVCRA; 62 dph). 
k, For another neuron (HVCRA; 65 dph). l, m, Two neurons recorded in the motif stage (bird 8). l, Neuron locked just after syllable onset (HVCX neuron; 61 dph). m, Same neuron as in Fig. 1c (HVCRA; 68 dph) showing locking late in the song syllable. n, Population raster of 14 neurons, aligned to protosyllable onsets (56–59 dph; bird 1).
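The burst definition in panel b can be sketched in a few lines. The following is a minimal illustration, assuming spike times in seconds; only the 30 ms threshold and the logarithmic binning come from the caption, whereas the function names, the `min_spikes` parameter and the histogram range are hypothetical choices made for this example and may differ from the procedure described in Methods.

```python
import math

BURST_ISI_THRESHOLD_S = 0.030  # 30 ms threshold quoted in the caption


def detect_bursts(spike_times, max_isi=BURST_ISI_THRESHOLD_S, min_spikes=2):
    """Group spike times (seconds) into bursts.

    Consecutive spikes separated by less than `max_isi` go into the same
    burst. `min_spikes` is an assumption of this sketch: groups with fewer
    spikes are discarded.
    """
    bursts, current = [], []
    for t in sorted(spike_times):
        # A gap at or above the threshold closes the current group.
        if current and t - current[-1] >= max_isi:
            if len(current) >= min_spikes:
                bursts.append(current)
            current = []
        current.append(t)
    if len(current) >= min_spikes:
        bursts.append(current)
    return bursts


def log_isi_histogram(spike_times, lo=1e-3, hi=1.0, bins_per_decade=10):
    """Count interspike intervals in logarithmically spaced bins.

    The range (1 ms to 1 s) and the bin density are assumptions.
    Returns (bin_edges, counts).
    """
    times = sorted(spike_times)
    isis = [b - a for a, b in zip(times, times[1:])]
    n_bins = int(round(math.log10(hi / lo) * bins_per_decade))
    edges = [lo * 10 ** (i / bins_per_decade) for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for isi in isis:
        if lo <= isi < hi:
            idx = int(math.log10(isi / lo) * bins_per_decade)
            counts[min(idx, n_bins - 1)] += 1
    return edges, counts


spikes = [0.100, 0.104, 0.108, 0.400, 0.404, 0.900]
bursts = detect_bursts(spikes)
edges, counts = log_isi_histogram(spikes)
```

With logarithmic bins, within-burst intervals of a few milliseconds and between-burst intervals of hundreds of milliseconds fall into well-separated groups of bins, which is why the bimodal structure is easier to see than with linear binning.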
a–d, Examples of HVC projection neurons showing rhythmic activity during non-rhythmic song. a, Bird 2, HVCRA neuron, 57 dph. b, Bird 12, HVCX, 53 dph. c, Bird 12, HVCRA, 57 dph. d, Syllable onset-aligned raster plot for neuron shown in c. Syllables are sorted in order of increasing duration (bottom to top; blue line indicates syllable offset). Also shown (top) is the onset-aligned spike histogram. Note multiple rhythmic bursts during long syllables. Scale bars: panels a–c, 1 mV, 100 ms. e–l, Bout-related activity of HVC projection neurons. e, Bout-onset neuron (HVCX; 44 dph; bird 11). f, Bout-onset aligned histogram and raster plot for the neuron shown in panel e. g, Bout-onset aligned histogram and raster plot for the neuron shown in Fig. 1d. h, Distribution of pre-bout-onset latencies for all bout-onset neurons (n = 187 neurons, 32 birds). i, Bout-offset neuron (HVCX; 61 dph; bird 1). j, Bout-offset aligned histogram and raster plot for the neuron shown in panel i. k, Bout-offset aligned histogram and raster plot for the neuron shown in Fig. 1e. l, Distribution of post-bout-offset latencies for all bout-offset neurons (n = 149 neurons, 32 birds). Vertical scale bars in panels e and i, 0.5 mV. m–o, Developmental progression of HVC activity analysed separately for HVCRA and HVCX neurons. m, Fraction of neurons temporally locked to syllables (mean ± s.e.m.; HVCRA: 9, 22, 83, 54 and 10 neurons analysed at each stage, respectively; HVCX: 27, 91, 376, 244 and 22 neurons analysed at each stage, respectively). n, Fraction of neurons that exhibited rhythmic bursts (HVCRA: 9, 22, 83, 54 and 10 neurons, respectively; HVCX: 27, 91, 376, 244 and 22 neurons, respectively). o, Mean period of HVC rhythmicity as a function of song stage (HVCRA: 0, 16, 50, 41 and 7 neurons, respectively; HVCX: 3, 41, 245, 189, 18 neurons, respectively). 
Of the 14 comparisons between HVCRA and HVCX neurons shown in panels m–o, only the period of HVC rhythm (panel o) during the motif stage showed a significant difference between the cell types (P < 0.05 with Bonferroni correction). p–r, Analysis of probabilistic participation in rhythmic activity during protosyllables. p, Distribution of the fraction of protosyllables on which spiking occurred (n = 70 neurons). In contrast to the highly reliable bursting of HVC projection neurons in adult birds (refs 19–22), we found that neurons in the protosyllable stage participated probabilistically (mean: 53% of protosyllables; triangle symbol). q, Histogram of the coefficient of determination r² for protosyllable participation across simultaneously recorded pairs of neurons (median r² = 0.072; n = 11 pairs; see Methods). r, Histogram of mutual information for protosyllable participation across simultaneously recorded pairs of neurons (median 0.056 bits; n = 11 pairs; see Methods). s, t, Analysis of burst coverage by HVC projection neuron bursts. s, Summary histogram of the covered fraction for all analysed syllables (n = 20 syllables, 4 birds). Note that 17/20 syllables had a covered fraction higher than 90%. t, Covered fraction analysed for 20 syllables for which raster plots are shown in the main or Extended Data figures. Vertical grey bars indicate 95% confidence interval (2.5–97.5 percentile) of coverage expected for random uniform shuffling of the observed bursts (see Methods). Note that for all syllables, the observed coverage is within the confidence interval for randomly shuffled bursts. These findings suggest that, even for the three syllables with coverage less than 90% (indicated with red square symbol), the lower coverage was consistent with undersampling due to the smaller number of recorded neurons in these birds. Regarding two models of HVC coding: our findings bear on several recent models of song representation in HVC. 
One earlier model hypothesizes that HVC bursts provide timing signals to drive premotor activity (refs 19, 58, 67) and to control the temporal precision of learning (refs 76–79). This model implies a continuous, though not necessarily uniform, coverage of HVC bursts throughout song, as observed in our data. Overall, given the very large number of HVC neurons in each hemisphere (>10⁴; ref. 80), our measurements are consistent with a continuous representation of timing signals throughout song syllables. Another model of HVC coding has emphasized the finding that bursts may occur more often at particular times in the song, related to ‘gestures’ in the vocal control parameters (ref. 22). Our finding that bursts are more concentrated around syllable onsets early in vocal development suggests that HVC may generate protosyllables as primitive gestures that serve as a scaffold on which later song syllables develop (ref. 33). During development, HVC activity appears to evolve such that, as a population, bursts occur more uniformly throughout song syllables (Fig. 2c), while the activity of individual neurons becomes sparser and more precise. At the same time, one might imagine that vocal gestures become more complex and precise as syllables develop into their adult forms. In this view, the emergence of sequential activity in HVC may be viewed as driving an increasingly complex sequence of gestures.
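The pairwise participation statistics in panels q and r can be illustrated as follows. This is a sketch under stated assumptions: each neuron's participation is coded as a binary sequence (1 if it spiked on a given protosyllable, 0 otherwise), r² is taken as the squared Pearson correlation of the two sequences, and mutual information is the plug-in estimate from the 2 × 2 joint distribution. The function name is hypothetical, and the Methods may apply corrections that are not reproduced here.

```python
import math


def participation_stats(x, y):
    """Return (r_squared, mutual_information_bits) for two equal-length
    binary sequences of protosyllable participation."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")

    # Joint counts over the four (neuron 1, neuron 2) outcomes.
    joint = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for a, b in zip(x, y):
        joint[(int(bool(a)), int(bool(b)))] += 1

    px1 = (joint[(1, 0)] + joint[(1, 1)]) / n  # P(neuron 1 participates)
    py1 = (joint[(0, 1)] + joint[(1, 1)]) / n  # P(neuron 2 participates)
    px = {0: 1 - px1, 1: px1}
    py = {0: 1 - py1, 1: py1}

    # Plug-in mutual information in bits; empty cells contribute nothing.
    mi = 0.0
    for (a, b), count in joint.items():
        if count:
            pab = count / n
            mi += pab * math.log2(pab / (px[a] * py[b]))

    # Squared Pearson correlation of two binary variables.
    var_x = px1 * (1 - px1)
    var_y = py1 * (1 - py1)
    if var_x == 0 or var_y == 0:
        r2 = 0.0  # a constant sequence carries no covariation
    else:
        cov = joint[(1, 1)] / n - px1 * py1
        r2 = cov * cov / (var_x * var_y)
    return r2, mi


identical = participation_stats([1, 0, 1, 0], [1, 0, 1, 0])
independent = participation_stats([1, 1, 0, 0], [1, 0, 1, 0])
```

Values near zero for both measures, as reported in the caption, indicate that whether one neuron participates on a given protosyllable says little about whether the other does. The plug-in estimate of mutual information is biased upward for short sequences, so small positive values should be read with that in mind.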
Extended Data Figure 3 Increase in the period of HVC rhythmicity during alternating syllable differentiation.
All data are from bird 1. a, Paired recording of a shared neuron (top; HVCRA) and a β-specific neuron (bottom; HVCX; 69 dph). b, Paired recording of a shared neuron (top; HVCX) and a C-specific neuron (bottom; HVCX; 110 dph). c, Neuron switching between shared and specific spiking (HVCX; 63 dph). d, Same neuron as in c, switching from specific to shared spiking. e, A different neuron switching from shared to specific spiking (HVCp; 68 dph). Scale bars in panels a–e, 0.5 mV, 200 ms. f–i, Inter-burst interval (IBI) distributions for shared and specific neurons. f, For the neuron in Fig. 3c recorded during protosyllable stage. g, For the shared neuron shown in the top panel of Fig. 3f. h, For the β-specific neuron shown in Fig. 3d. i, For a γ-specific neuron (not shown). j, Population summary of the ‘most-probable IBI’ for the neurons recorded during the protosyllable stage (n = 9), and during the emergence of syllables β and γ (62–72 dph; shared neurons, n = 22; specific neurons, n = 83). Note that shared neurons had the same ‘most-probable IBI’ as neurons recorded during the protosyllable stage. Neurons exhibiting an increased burst period by skipping cycles of an underlying rhythm were also observed in birds 3, 4 and 6 (see Extended Data Figs 8f–h and 9f, h).
a–d, Latencies of shared neuron bursts, colour-coded by cell type: HVCRA (red square), HVCX (blue circle), and HVCp (green diamond). a, Neurons in bird 1 shared between syllables β and γ (from Fig. 3) recorded during the early (top) and late (bottom) stages of syllable differentiation. Note strong correlation of burst latencies (early, r = 0.91, P < 0.001; late, r = 0.87, P = 0.005). b, Neurons in bird 1 shared between syllables D and B (Extended Data Fig. 7) during the early and late stages of syllable differentiation (top, early r > 0.99, P < 0.001; bottom, late r > 0.99, P < 0.001). c, Neurons in bird 2 shared between syllables β and α (Fig. 4h) during the early and late stages (top, early r > 0.99, P < 0.001; bottom, late r > 0.99, P < 0.001). A shared neuron that had two peaks during the syllable α is shown with an ‘x’ symbol; this point was not included in the calculation of correlation. d, Neurons in bird 4 shared between ‘b2’ and ‘d1’ (Extended Data Fig. 9l) during early stage (top, r = 0.89, P < 0.001; neurons that burst in the first part of ‘b’ (‘b1’) are shown with ‘x’ symbol, and were not included in the calculation of correlation). Neurons in bird 4 shared between syllables ‘c’ and ‘d2’ (Extended Data Fig. 9n) during early stage (bottom, r = 0.98, P < 0.001). Regarding bias: as a population, shared neurons exhibited a broad range of selectivity for emerging syllable types—some were equally active for both syllable types while others showed higher activity in one syllable than the other (‘bias’; see Methods). e, Raw spike data (top left) and instantaneous firing rate (bottom left) for a neuron shared between syllables β and γ (HVCp; 68 dph, bird 1). Also shown is the syllable-onset-aligned raster plot (bottom right) and histogram (top right) showing similar peak firing rates for both syllables (low bias; bias = 0.07). 
f, Spike data (left) and syllable-onset-aligned raster plot and histogram (right) for a high-bias shared neuron showing higher peak firing rate for syllable β than γ (bias = 0.63; HVCRA; 68 dph, bird 1). g, Low-bias shared neuron (bias = 0.06; HVCX; 69 dph, bird 2). h, High-bias shared neuron showing higher peak firing rate for syllable β than α (bias = 0.55; HVCX; 68 dph, bird 2). i, Scatter plot of the peak firing rates during two different syllable types, quantified by the height of the peak in the syllable-aligned spike histogram. Each dot is a neuron; shared neurons shown in cyan; neurons near the diagonal have low bias. Specific neurons are coloured according to the associated syllable and appear near the axes. j, Distribution of the bias for shared neurons (cyan) and specific neurons (magenta). Bias ranged from 0, representing equal activity, to 1, representing activity exclusive to either one of the syllables (see Methods). Specific neurons exhibited a bias tightly clustered around one (0.96 ± 0.011, mean ± s.d.). In contrast, shared neurons exhibited a broad range of bias (0.28 ± 0.22). These observations suggest that individual shared neurons can exist in a state intermediate between ‘specific’ and ‘shared’—perhaps reflecting a gradual process by which shared neurons become specific. Scale bars for panels e–h, 0.5 mV, 100 ms. Insets in panels f and h show zoom of bursts indicated by an asterisk; scale bar: 5 ms.
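The caption states only the end-points of the bias measure (0 for equal activity, 1 for activity exclusive to one syllable) and defers the definition to Methods. One simple index with those end-points is the normalized difference of the two peak firing rates. The sketch below uses that form purely as an illustrative assumption; the function name and the formula are not taken from the source and may differ from the definition the authors used.

```python
def bias_index(peak_rate_a, peak_rate_b):
    """Normalized difference of two peak firing rates (assumed form).

    Returns 0 when the rates are equal and 1 when the neuron is active
    in only one of the two syllables.
    """
    if peak_rate_a < 0 or peak_rate_b < 0:
        raise ValueError("firing rates must be non-negative")
    total = peak_rate_a + peak_rate_b
    if total == 0:
        raise ValueError("neuron is silent in both syllables")
    return abs(peak_rate_a - peak_rate_b) / total


equal = bias_index(100.0, 100.0)      # same peak rate in both syllables
exclusive = bias_index(200.0, 0.0)    # active in one syllable only
intermediate = bias_index(300.0, 100.0)
```

Under this assumed form, an index of 0.5 corresponds to a threefold difference in peak rate between the two syllables, which gives a rough sense of scale for the high-bias examples in panels f and h.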
While emerging syllable types gradually differentiate acoustically, some parts of different emerging syllable types may be acoustically quite similar. We wondered whether shared neurons are active only at times within emerging syllables when no acoustic differentiation has yet occurred, that is, when the emerging syllable types are acoustically identical. To test this possibility, we analysed the trajectories of acoustic features of emerging syllable types around the times of shared neuron bursts. a, Shared HVCRA neuron recorded in bird 1 during alternation between emerging syllable types β and γ (same neuron as Fig. 3e). b, c, Average spectrogram (sparse contour representation; see Methods) computed for syllables β and γ, centred on a 50 ms window immediately after the burst in each syllable. d, Song amplitude as a function of time for syllables β (red) and γ (blue), relative to burst time. Lines show average across all syllable renditions on which the neuron was active. Shading around lines shows s.e.m. (for this and several other examples, s.e.m. is too small to be visible). e, Spectral centre of gravity as a function of time for syllables β (red) and γ (blue). f, Distribution of projected samples for syllables β (red) and γ (blue), computed by projecting the 8-dimensional vector of spectral features onto a line that yields maximum separability between the two syllables. This distribution is computed at each time (1 ms steps) in the 50-ms analysis window after burst time. Shown is the distribution at t = 25 ms. g, d-prime analysis of separability of projected samples for syllables β and γ. The value of d′ is computed as a function of time (1 ms steps; red trace). Also shown is the 95% confidence interval (grey band) computed from surrogate data sets with randomized labels. Dashed horizontal line shows the 95th percentile of the distribution of peak values of d′ in the surrogate data set (identified in the 10–40 ms window). 
h–j, Acoustic analysis for three additional HVCRA neurons (analogous to panels a–g). k, Plot of d′ trajectories for all shared HVCRA neurons. Significant d′ values (above the 95th percentile of peak values) are shown in red. Non-significant values are shown as grey lines. l, Same as panel k but for shared HVCX neurons. m, Population summary of mean d′ (averaged over the presumptive premotor window 10–40 ms after burst time). Each symbol represents a different shared neuron and each column indicates a different syllable pair. Analysis is shown separately for each neuron type: HVCRA neurons (green circles) and HVCX neurons (blue squares). Neurons with no significant acoustic differences are indicated with black symbols. n, Cumulative distribution of mean d′ for shared HVCRA neurons (green; n = 11) and shared HVCX neurons (blue; n = 36). Only neurons with a significant d′ metric are included in the cumulative distribution. No significant difference was observed between neuron types (P = 0.1). Scale bars for panels a, h, i, j are 0.5 mV, 100 ms. Summary of properties of HVCRA and HVCX shared neurons: Shared neurons were found in similar proportion across both HVCRA and HVCX neurons (19% and 28%, respectively; P = 0.08; averaged over all developmental stages) and shared neurons of both cell types exhibited the property that bursts have similar latencies during the shared syllables (Extended Data Fig. 4a–d). As shown above, for both neuron types, we observed shared neurons that burst at times where there was a significant acoustic difference between the shared syllables. These findings suggest that both projection neuron types participate in shared neural sequences, and that these shared sequences occur during acoustically distinguishable parts of the emerging syllables.
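The separability test in panel g can be sketched for a single time point. The example below assumes that each syllable rendition has already been reduced to one projected value (the projection of the 8-dimensional feature vector is omitted), uses the common equal-weight form of d′, and builds the null distribution by shuffling syllable labels. Function names, the number of shuffles and the percentile rule are assumptions of this sketch rather than the procedure in Methods.

```python
import random
import statistics


def d_prime(a, b):
    """d' = |mean(a) - mean(b)| / sqrt((var(a) + var(b)) / 2)."""
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    diff = abs(statistics.mean(a) - statistics.mean(b))
    if pooled_var == 0:
        # Degenerate case: no spread within either group.
        return float("inf") if diff > 0 else 0.0
    return diff / pooled_var ** 0.5


def shuffled_threshold(a, b, n_shuffles=1000, percentile=95, seed=0):
    """Percentile of d' obtained after randomly reassigning labels."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pooled = list(a) + list(b)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        null.append(d_prime(pooled[:len(a)], pooled[len(a):]))
    null.sort()
    idx = min(len(null) - 1, int(len(null) * percentile / 100))
    return null[idx]


beta = [0, 1, 2]       # hypothetical projected values for syllable β
gamma = [10, 11, 12]   # hypothetical projected values for syllable γ
observed = d_prime(beta, gamma)
threshold = shuffled_threshold(beta, gamma, n_shuffles=200)
```

Because the projection in the caption is chosen to maximize separability, a faithful surrogate analysis would refit the projection for every shuffle; skipping that step, as this sketch does, would tend to underestimate the null values of d′. The caption also takes the threshold from the peak of d′ across the 10–40 ms window, which requires repeating the calculation at each 1 ms step.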
(Same bird as in Fig. 4). a, Song examples throughout song development. Panels from top to bottom: first, subsong (49 dph); second, emergence of protosyllable α from subsong (60 dph); third, appearance of bout-onset element ε (63 dph); fourth, fusion of ε with first α to form new syllable β (67 dph); fifth and sixth, acoustic differentiation of β and α, and incorporation with γ into song motif CBA (70, 90 dph); seventh, tutor song. b, Schematic of syllable formation (same as Fig. 4a), inferred by tracking backward in development the adult syllables C, B and A. Early on, protosyllable (labelled α) is produced rhythmically. The first protosyllable in each bout fuses with a brief bout-onset vocal element ε to form a new emerging syllable type β. Both α and β undergo subsequent acoustic differentiation to form adult syllables A and B, respectively. (An additional syllable γ emerges at bout onset to form adult syllable C.) c, Developmental time course of the occurrence probability of different syllable types at bout onsets (mean ± s.e.m.). d, Syllable duration distribution showing three non-overlapping peaks (67 dph). Coloured bars indicate syllable duration ranges used for syllable labelling. This separation of durations allowed automatic determination of syllable identity. e, Pitch goodness trajectories of syllables α (red) and β (blue) at three stages of vocal development (median ± quartiles; n = 100 syllables per day). Black bar, region used to compute data in Fig. 4b. f, Example of a neuron active during both syllables α and β (HVCRA; 69 dph). Note that the activity of this neuron during syllable α was weak, and did not quite reach our statistical criterion for being a ‘shared’ neuron.
All data are from bird 1 (same bird as in Fig. 3). a, Song examples during the emergence of syllables B and D from a common precursor syllable β, which had undergone earlier differentiation from a protosyllable α. Panels from top to bottom: first (70 dph), after the initial differentiation of the protosyllable into β and γ (at ~62 dph), the bird produced a rhythmic alternation of these two syllables, and the alternating sequence was reliably preceded at bout onsets by a short vocal element ε (ε-β-γ-β-γ-β-γ…). Note that the first repetition of β in each bout (labelled D) is acoustically identical to later repetitions (labelled B); second (80 dph), the first repetition of β in the bout (syllable D) undergoes differential acoustic refinement compared to later repetitions (syllable B); third, syllables B, C and D, together with bout-onset element ε, crystallize into adult motif EDCB (90 dph), which approximately matches the tutor motif (bottom panel). b, Schematic of syllable formation. c, Scatter plot of the mean Wiener entropy showing differential acoustic refinement of syllables B (orange) and D (green) through development (n = 100 syllables of each type per day; horizontal jitter added to improve data visibility). d, Wiener entropy trajectory of syllables B and D at three stages of vocal development (median ± quartiles; n = 100 syllables of each type per day). Black bar indicates region used to compute data in panel c. e, Population raster of 60 neurons early in syllable differentiation showing shared (top) and specific (bottom) sequences. f, Same as e, but for 70 neurons recorded late in differentiation of D and B. Evidence for an incomplete splitting of a neural sequence: the pattern of shared and specific neurons observed for these syllables is quite similar to what would be expected in our model during an early/intermediate stage of splitting (Fig. 5c or Extended Data Fig. 10c). 
Of particular note in this bird is the large fraction of shared neurons between B and D that remained in the later recordings (panel f), compared to the smaller fraction of shared neurons at late stages in syllables B and C of the same bird (Fig. 3h). However, syllables B and C differentiated from parent syllable α early in development (~60 dph, Fig. 3b), while D and B differentiated from β at a much later stage (~80 dph, panel c). One might speculate that the splitting of D and B may have failed to reach completion before the bird reached adulthood, possibly preventing further splitting. Neural evidence (shared burst sequence) for hierarchical differentiation was also observed in bird 6 (data not shown). Neural evidence (shared burst sequence) for bout-onset differentiation was also observed in bird 5 (data not shown).
All data are from bird 3. Neural recordings from this bird support the view that, in the ‘motif strategy’, new syllables emerge from a common rhythmic protosequence. a, Song examples during the emergence of a motif. Panels from top to bottom: first, subsong (37 dph); second, the song began to acquire rhythmic ‘protosyllable’ modulation in song amplitude around 9 Hz (45 dph); third, over the next five days (47–51 dph), this bird acquired a reliable pattern of 4–5 acoustically distinct elements (‘syllables’), each generated in a different cycle of the 9 Hz rhythm (48 dph); fourth, the acoustic structure in each syllable was gradually refined, resulting in an excellent match to the tutor song even at this early age (51 dph); fifth, tutor song. b, Scatter plot of syllable duration and pitch goodness (n = 300 syllables per day; colour coded according to syllable identity in panel a). c, Development of song rhythmicity quantified as the spectrum of the sound amplitude (ref. 38). Grey shading indicates the pass band for the filter used in phase segmentation. d, Phase segmentation based on the rhythmicity in the song. Top, song spectrogram with phase segments (grey boxes). Middle, sound amplitude (blue) and band-pass filtered sound amplitude (magenta). Syllable segmentation based on the sound amplitude is shown as white boxes. Bottom, instantaneous phase (green) of the band-pass filtered sound amplitude. Phase segments (grey boxes) are obtained by detecting threshold crossings (black dotted line) of the instantaneous phase. e, Rhythmic neuron (protosyllable stage; HVCp; 45 dph). f, Neuron shared between syllables A and B (HVCRA; 48 dph). g, Neuron shared between B and E (HVCX; 49 dph). h, Population raster aligned to the five-syllable motif for neurons that were significantly locked to any syllable (n = 10 neurons). Each motif and associated spike times were time-warped using a piecewise linear method (ref. 67) based on syllable onsets and offsets.
i, Histogram of the absolute phase difference between the two syllables for all shared neurons (n = 8 neurons; mean phase difference: 41 ± 33.9 deg, mean ± s.d.). j, Cumulative distribution of the mean absolute phase difference after randomizing burst identity (red dotted line indicates P = 0.05 threshold for significance; red triangle indicates observed mean absolute phase difference, P = 0.013). Statistical details in Methods. Scale bars for panels e–g, 30 dB, 0.3 mV, 200 ms.
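The piecewise linear time-warping used to align motifs and spike times for panel h can be sketched as follows. This is a generic illustration of the technique, not the procedure of the cited reference: the function name `warp_times`, the use of syllable onsets and offsets as a flat list of landmarks, and the template values in the usage note are all assumptions.

```python
from bisect import bisect_right


def warp_times(times, landmarks, template):
    """Piecewise-linearly map event times so that each landmark lands on the template.

    landmarks: strictly increasing syllable onsets/offsets in one song rendition.
    template:  the corresponding onsets/offsets of the reference motif.
    times:     event times (for example spike times) in the same rendition.
    Times outside the landmark range are extrapolated from the nearest segment.
    """
    if len(landmarks) != len(template) or len(landmarks) < 2:
        raise ValueError("need at least two landmarks and a template of equal length")
    warped = []
    for t in times:
        # Index of the segment [landmarks[i], landmarks[i + 1]] that contains t.
        i = bisect_right(landmarks, t) - 1
        i = max(0, min(i, len(landmarks) - 2))
        t0, t1 = landmarks[i], landmarks[i + 1]
        u0, u1 = template[i], template[i + 1]
        # Linear interpolation within the segment.
        warped.append(u0 + (t - t0) * (u1 - u0) / (t1 - t0))
    return warped
```

For example, with landmarks `[0, 100, 120, 250]` ms in one rendition and a template of `[0, 80, 100, 200]` ms, spikes at 50, 110 and 185 ms map to 40, 90 and 150 ms. Each syllable and each gap is stretched or compressed independently, so bursts keep their relative position within the syllable in which they occur.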
Extended Data Figure 9 Another example of shared burst sequences during the emergence of new syllable types.
All data are from bird 4. a, Song examples during the emergence of a motif ABCDF. Note the nearly simultaneous emergence of multiple syllable types in nearly fixed order (52 dph). Tutor song shown at the bottom. Phase segments are shown above the spectrogram for song at 43 dph. b, Top, song rhythm spectrum calculated in the protosyllable stage (43 dph) and after motif formation (59 dph). Note the pronounced peaks at 5 Hz and 10 Hz in both stages. Bottom, syllable duration distribution in the protosyllable stage (43 dph) and after motif formation (59 dph) showing two peaks. At 43 dph, the peak at 70 ms indicates short protosyllables corresponding to one cycle of the 10 Hz rhythm, and the peak at 140 ms indicates longer syllables formed by two protosyllables fused across two cycles of the 10 Hz rhythm (doubled protosyllables). Example doubled protosyllables are seen in the first and third syllables of panel a, 43 dph. (Note that boxes at the top of this panel indicate phase segments, not syllable boundaries). c, Hypothesized mechanism of motif construction, based on the examination of acoustic structure and analysis of neural burst sequences (see below). Notably, in this bird, the majority of syllables emerged nearly simultaneously in a relatively fixed order, consistent with a ‘motif strategy.’ d, Scatter plots of syllable duration versus mean spectral centre of gravity at four stages of vocal development (each dot represents a single syllable; n = 500 syllables per day; colour coded according to syllable identity in panel a). e, Neuron bursting at the 10 Hz protosyllable rhythm (HVCX; 48 dph). Phase segments shown above spectrogram. f, Top, neuron bursting at the 10 Hz rhythm (HVCX; 49 dph). Bottom, simultaneous recording of a neuron bursting on alternate cycles of the 10 Hz rhythm (HVCRA). g, Shared neuron bursting on second half of syllable ‘b’ (labelled b2) and first half of syllable ‘d’ (labelled d1) (HVCRA; 51 dph). 
h, Shared neuron bursting rhythmically on ‘b1’, ‘c’ and second half of ‘d’ (d2) (HVCRA; 51 dph). i, Shared neuron bursting on ‘a’ and ‘d1’ (HVCRA; 58 dph). j, Shared neuron bursting on ‘d2’, ‘e’, and last part of ‘f’ (HVCRA; 57 dph). k, Population raster of 12 neurons that were significantly locked to protosyllable onsets (48–49 dph). Protosyllables were identified using phase segmentation (see Methods). l, Population raster showing neurons active during syllables ‘b’ and/or ‘d’, recorded early in syllable differentiation. Neurons shared between ‘b’ and ‘d1’ are grouped at top. Neurons specific for ‘b’ are grouped next, and neurons specific for ‘d’ are grouped at bottom. m, Same as panel l, but for neurons recorded later in development. n, Population rasters showing neurons active during syllables ‘c’ and/or ‘d’, recorded early in development. o, Same as n, but for neurons recorded later in development. Scale bars for panels e–j, 0.5 mV, 200 ms. Neural evidence for hypothesized mechanism of motif construction: based on an analysis of acoustic signals and neural recordings, we have formulated a hypothesis for how the song of this bird developed, from the formation of the protosyllable to the emergence of the complete motif. We hypothesize that the fundamental protosyllable element corresponds to the prominent 10 Hz peak in the rhythm spectrum and the 70 ms peak in the duration distribution (panel b). This view is further supported by the presence of neurons in the protosyllable stage that generate rhythmic bursts at 10 Hz (panels e and f; 11/18 neurons were rhythmic, 5/11 rhythmic neurons exhibited periodicity at 10 Hz), and the existence of a burst sequence during the protosyllable (panel k).
In this bird, the rhythmic protosyllables differentiated nearly simultaneously, at an early age (52 dph, panel a), into a complete sequence of distinct syllables that subsequently formed the adult song, suggesting this bird employed a ‘motif strategy.’ One complication of this simple view is that there may have been an early partial splitting of the short protosyllable α into two ‘daughter’ protosyllables α1 and α2, which alternated to produce the elements of the final motif (panel c). Two lines of evidence based on neural activity support this view. First, many neurons recorded at an early stage (<50 dph) exhibited a prominent 5 Hz periodicity in their rhythmic bursting (panels f and h; 6/11 rhythmic neurons), rather than the expected 10 Hz periodicity (panels e and f, top trace). This observation led us to consider the possibility that the 100 ms neural sequence, corresponding to the dominant 10 Hz protosyllable rhythm, underwent a partial splitting during the protosyllable stage, similar to the alternating differentiation described for bird 1 (Fig. 3; Extended Data Fig. 4). This would result in two distinct alternating protosyllable sequences α1 and α2 (panel c). Such splitting would effectively double the period of the protosyllable rhythm, and would account for the ‘doubled’ protosyllables and the 5 Hz peak in the rhythm spectrum (panel b). The existence of short and doubled protosyllables led us to hypothesize that the short syllables of the adult motif (‘a’, ‘c’, and ‘e’) arose from the short protosyllables, while long adult syllables (‘b’ and ‘d’, and possibly ‘f’) arose from the doubled protosyllables (panel c). Early syllable ‘e’ is later dropped by the juvenile, although it appears in the tutor song. Second, the analysis of shared sequences (panels l–o) revealed a predominance of shared neurons between syllable elements in alternating cycles of the underlying 10 Hz rhythm.
For example, shared neurons were observed between syllables ‘a’, ‘b2’ and ‘d1’ (panel i for neuron shared between ‘a’ and ‘d1’; panels g and l for neurons shared between ‘b2’ and ‘d1’). Shared neurons were also observed between syllables ‘b1’, ‘c’, and ‘d2’ (panel h for neuron shared between ‘b1’, ‘c’, and ‘d2’; panel n for neurons shared between ‘c’ and ‘d2’). In contrast, many fewer shared neurons were observed between neighbouring cycles of the underlying rhythm, although examples of this can be found (panel j).
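The argument that an alternating split of a 10 Hz protosyllable rhythm doubles its period (100 ms to 200 ms) and so produces a 5 Hz spectral peak can be checked with a toy calculation. The sketch below is not the rhythm-spectrum analysis used for panel b: the pulse-train signal, the sampling rate, the amplitudes 1.0 and 0.5, and the function names are all assumptions chosen to keep the example small.

```python
import cmath
import math


def dft_magnitude(signal, sample_rate_hz, freq_hz):
    """Magnitude of the discrete Fourier component of signal at freq_hz."""
    total = 0j
    for n, x in enumerate(signal):
        total += x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
    return abs(total)


def pulse_train(amplitudes, period_samples):
    """One single-sample pulse per cycle, with the given per-cycle amplitudes."""
    signal = []
    for amplitude in amplitudes:
        signal.append(amplitude)
        signal.extend([0.0] * (period_samples - 1))
    return signal


SAMPLE_RATE_HZ = 100  # toy sampling rate for the amplitude envelope
PERIOD_SAMPLES = 10   # 100 ms per cycle, i.e. a 10 Hz rhythm
N_CYCLES = 20         # 2 s of toy "song"

# Identical protosyllables on every cycle: energy at 10 Hz, none at 5 Hz.
uniform = pulse_train([1.0] * N_CYCLES, PERIOD_SAMPLES)

# Two alternating variants (stand-ins for the hypothesized alpha1 and alpha2):
# the pattern now repeats every 200 ms, so a 5 Hz component appears.
alternating = pulse_train([1.0, 0.5] * (N_CYCLES // 2), PERIOD_SAMPLES)

for name, signal in (("uniform", uniform), ("alternating", alternating)):
    at_5 = dft_magnitude(signal, SAMPLE_RATE_HZ, 5.0)
    at_10 = dft_magnitude(signal, SAMPLE_RATE_HZ, 10.0)
    print(f"{name}: |X(5 Hz)| = {at_5:.2f}, |X(10 Hz)| = {at_10:.2f}")
```

In this toy signal the 10 Hz component is present in both cases, while the 5 Hz component is essentially zero for identical cycles and non-zero once alternate cycles differ, which is the qualitative pattern the legend attributes to partial splitting.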
a–d, Bout-onset differentiation results from activation of bout-onset seed neurons (blue arrow) followed by rhythmic activation of protosyllable seed neurons (red arrow). Network diagrams show (a, b) protosyllable formation and (c, d) splitting of chains specific for bout-onset syllable β and specific for later repetitions of the protosyllable α (blue and red, respectively; shared neurons: black). e–h, Model of simultaneous formation of multiple syllable types into an entire motif (‘motif strategy’). e, f, Protosyllable seed neurons (magenta lines) were activated rhythmically to form a protosequence. g, Seed neurons were then divided into three sequentially activated subgroups, resulting in the rapid splitting of the protosequence into three daughter sequences. In intermediate stages (panel g), individual neurons exhibited varying degrees of specificity and sharedness for the emerging syllable types. h, After learning, the population of neurons was active sequentially throughout the entire ‘motif,’ but individual neurons were active during only one of the resulting syllables, forming three distinct non-overlapping sequences. i–k, Network diagrams and raster plots showing an example of the formation of a new syllable chain at bout onset. In the network diagrams, seed neurons are indicated within magenta boxes, and bout-onset seed neurons and protosyllable seed neurons are indicated by blue and red arrows, respectively. Neurons specific for each emerging syllable type (ε and α) are coloured blue and red, respectively. The three panels represent the early protosyllable stage, the late protosyllable stage, and the final stage. The training protocol is similar to that for bout-onset differentiation (panels a–d), except that protosyllable seed neurons are driven more strongly throughout the learning process. As a result, protosyllable seed neurons did not become outcompeted by the growing bout-onset chain. 
Strong activation of the protosyllable seed neurons also terminated activity in the bout-onset chain through fast recurrent inhibition, thus preventing further growth of the bout-onset chain, as occurs in bout-onset differentiation. Regarding the role of chain splitting in the formation of new syllable types: in our model, we envision that the formation of daughter chains in HVC is translated into the emergence of new syllable types as follows. During the splitting process, as two distinct sequences of specific neurons develop, their downstream projections can be independently modified (refs 67, 77) such that each of the emerging chains of specific neurons can drive a distinct pattern of downstream motor commands, allowing distinct acoustic structure in the emerging syllable types. Such differential acoustic refinement is consistent with the previous behavioural observation that the altered acoustic structure of new syllables emerges in place, without moving or reordering sound components (‘sound differentiation in situ’; ref. 33). This model naturally explains the apparent ‘decoupling’ of shared projection neuron bursts from acoustic structure in the vocal output, that is, the fact that the bursts of shared neurons become associated with two distinct acoustic outputs during the differentiation of two syllable types (Extended Data Fig. 5). Specifically, during syllable differentiation, a shared neuron participates with different ensembles of neurons during each of the emerging sequences, and these different ensembles can drive different vocal outputs.
This file contains the script to reproduce Figure 5a–d and Extended Data Figure 10a–d. (TXT 55 kb)
Video showing network progression over development for the alternating differentiation shown in Fig. 5a–d. Network diagrams are plotted, and text indicates the developmental stage (subsong, protosyllable stage, or splitting stage) and iteration number. (MP4 2274 kb)
Video showing network progression over development for the bout-onset differentiation shown in Extended Data Fig. 10a–d. Network diagrams are plotted, and text indicates the developmental stage (subsong, protosyllable stage, or splitting stage) and iteration number. (MP4 2717 kb)
Okubo, T., Mackevicius, E., Payne, H. et al. Growth and splitting of neural sequences in songbird vocal development. Nature 528, 352–357 (2015). https://doi.org/10.1038/nature15741