The units of a song


Exactly when motor-planning neurons function to produce a bird's song is debatable. New data suggest that bursts of activity in these cells mark sudden changes in the commands to the vocal organ. See Article p.59

What is the basic unit of speech? The word? The syllable? The phoneme? This question has been vexing speech and language researchers for decades, and similar questions have challenged those who study songbirds. Whereas behavioural evidence1 supports the grouping of songs into 100–250-millisecond vocalizations called syllables, neurophysiological data suggest2 that the premotor areas at high levels in the hierarchy of motor neurons in the brain act more like a clock, providing a continuous stream of activity on a 10-millisecond timescale. On page 59 of this issue, Amador et al.3 reconcile these data by providing evidence that the song code generated by motor neurons of zebra finches (Taeniopygia guttata) is indeed broken into discrete 'gestures', which are significantly shorter than song syllables.*

The study has roots in two research programmes that started at opposite ends of the motor-coding problem. One group studied the highest levels of the motor system, in which sensory signals about a song's acoustics change the song motor program during learning. The researchers discovered4 that, for every rendition of the bird's song, individual neurons produce short bursts of activity with incredible regularity and precision. They also demonstrated5 a remarkable correspondence between the motor activity that was recorded when the bird was singing and the auditory activity that resulted from playing the bird's song back to it when it was asleep.

The other team investigated how sound is generated by the avian vocal organ, the syrinx. They developed a simplified biophysical model of the syrinx with two dynamic parameters: the pressure in the bird's air sac and the spring-like tension on a vibrating membrane controlled by the muscles surrounding the syrinx. Analysis of the model showed that small changes in pressure and tension can lead to output that is a passable imitation of the sounds produced by several species of songbird6,7. This work also suggested that, to sing, birds may not need precise control over a large ensemble of muscles. Rather, two basic signals may suffice, as long as the signals are controlled in a temporally precise manner.

Combining their previous approaches, the two research programmes now come together. Amador et al. focus on a high-level cluster of neurons called the HVC, which is essential for singing, but — in terms of synaptic connections between neurons — is the most distant from the syrinx. They recorded the activity of individual HVC cells either while the birds sang or during playback of the bird's own song while it slept. They also tuned the syrinx model to reproduce each bird's song. By defining a vocal gesture as a period of time when both the pressure and tension parameters were either unchanged or strictly increasing or decreasing, they could divide the song into a sequence of distinct gestural units.
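The gesture definition above lends itself to a short illustration. The following is a minimal sketch, not the authors' procedure: it assumes that pressure and tension are available as equally sampled sequences, and it introduces a hypothetical tolerance `tol` for deciding when a parameter counts as unchanged. The function name `segment_gestures` is likewise invented for this example.

```python
def trend(delta, tol=1e-9):
    """Classify a sample-to-sample change as rising (1), falling (-1) or flat (0)."""
    if delta > tol:
        return 1
    if delta < -tol:
        return -1
    return 0


def segment_gestures(pressure, tension, tol=1e-9):
    """Split a song into gestures.

    A gesture is taken to be a maximal run of samples over which each of the
    two parameters keeps a single trend (flat, rising or falling).
    Returns (start, end) sample indices, inclusive; neighbouring gestures
    share their boundary sample.
    """
    if len(pressure) != len(tension):
        raise ValueError("pressure and tension must have the same length")
    n = len(pressure)
    if n < 2:
        return []

    # trends[i] describes the step from sample i to sample i + 1.
    trends = [
        (trend(pressure[i + 1] - pressure[i], tol),
         trend(tension[i + 1] - tension[i], tol))
        for i in range(n - 1)
    ]

    gestures = []
    start = 0
    for i in range(1, len(trends)):
        if trends[i] != trends[i - 1]:
            # Either parameter changed its trend: close the gesture at sample i.
            gestures.append((start, i))
            start = i
    gestures.append((start, n - 1))
    return gestures


# Toy traces (illustrative values only): pressure rises, holds, then falls,
# while tension stays constant.
toy_pressure = [0, 1, 2, 2, 2, 1, 0]
toy_tension = [0, 0, 0, 0, 0, 0, 0]
toy_gestures = segment_gestures(toy_pressure, toy_tension)
```

With real parameter traces, noise would probably require smoothing or a larger tolerance before the trends are compared; that choice is left open here.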

On aligning the neural and behavioural data, the authors found that activity bursts in HVC neurons occurred at specific time points in the song, namely at the boundaries between gestures. The results suggest that the gesture — which is longer than a burst but shorter than a syllable — might be the basic unit of song production.
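One simple way to picture this alignment is to measure, for each burst, the signed time to the nearest gesture boundary. The sketch below is an assumption about how such a comparison could be set up, not a description of the authors' analysis; the function name `nearest_boundary_offsets` and all times are hypothetical.

```python
from bisect import bisect_left


def nearest_boundary_offsets(burst_times_ms, boundary_times_ms):
    """Return, for each burst, burst time minus the nearest boundary time.

    Negative values mean the burst precedes its nearest gesture boundary;
    positive values mean it follows the boundary.
    """
    boundaries = sorted(boundary_times_ms)
    if not boundaries:
        raise ValueError("at least one gesture boundary is required")

    offsets = []
    for t in burst_times_ms:
        k = bisect_left(boundaries, t)
        # Only the boundaries on either side of the burst can be nearest.
        candidates = boundaries[max(k - 1, 0):k + 1]
        nearest = min(candidates, key=lambda b: abs(b - t))
        offsets.append(t - nearest)
    return offsets


# Illustrative times in milliseconds, not recorded data.
example_offsets = nearest_boundary_offsets([3, 48, 100], [0, 50, 120])
```

Offsets clustered around zero would correspond to bursts that sit at gesture boundaries, whereas a clock-like code sampled at random would be expected to spread offsets across the whole interval between boundaries.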

This finding contrasts with the reigning view of the motor code for birdsong that was originally developed2,8 to account for the precise bursting activity of HVC neurons (Fig. 1). Finding no clear relationship between burst timing and the division of song into syllable-based units, researchers proposed that the HVC acted more like a clock: bursting in one set of HVC neurons triggered a burst in the next set, forming a continuous set of 'ticks' throughout the song.

Figure 1: The clock and gesture hypotheses.

In the premotor cluster of HVC neurons, which is essential for singing, each neuron produces a single burst of activity (bars) precisely locked to the song output. Recordings are possible from only a few neurons (red) in any given bird. a, It was proposed that the unrecorded neurons (open) are continuously active throughout the song, acting like a clock to pace the song output. b, By building a model of the bird's vocal organ, Amador et al.3 produce a new set of 'sheet music' for the song that specifies the motor commands needed to make any given sound. They find that every burst they recorded fell near a transition point between gestures (start times for notes in the sheet), suggesting that song is encoded as a series of distinct units.

Although the clock and gesture hypotheses lead to different views of the motor code for song, it is entirely possible that whereas bursting activity in HVC neurons tends to align with gesture transitions, a sufficient number of HVC neurons is active throughout each gesture to sustain clock-like functionality. Because ruling out this variation on the clock hypothesis would require demonstrating a negative — that there are no HVC neurons active during gestures — the debate over the status of the two hypotheses will probably linger for some time.

Amador and colleagues' results also contain a deeper mystery, the resolution of which may yield insight into how a bird learns its song (Fig. 2). The mystery stems from their observation that the average delay between an HVC burst and its associated gesture transition was near zero milliseconds. However, neural signals in the HVC must be relayed through several stages before they can alter the contraction of respiratory and syringeal muscles, a process estimated to take 20 milliseconds8. Thus, the bursts recorded during singing occur too late to actually cause gesture transitions. Similarly, the sound signal that arrives at the bird's ears has to traverse several synapses, causing an estimated delay of 15 milliseconds, before a sensory representation of it is registered in the HVC. This means that the bursts recorded during sleep, which align to sound with a zero-millisecond delay, occur too early to be caused by the auditory detection of a gesture transition.
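The timing argument can be restated as simple arithmetic. The sketch below uses only the delays quoted above and treats the observed "near zero" offset as exactly 0 milliseconds, which is a simplifying assumption; the variable and function names are invented for this illustration.

```python
MOTOR_DELAY_MS = 20.0      # estimated HVC-to-muscle delay quoted in the text
AUDITORY_DELAY_MS = 15.0   # estimated ear-to-HVC delay quoted in the text
OBSERVED_OFFSET_MS = 0.0   # "near zero", idealized here as exactly zero


def timing_mismatch(observed_offset_ms, expected_offset_ms):
    """Observed minus expected burst time, both relative to the gesture transition."""
    return observed_offset_ms - expected_offset_ms


# If a burst caused the transition, it should lead the transition by the motor delay.
expected_if_motor_cause_ms = -MOTOR_DELAY_MS
# If a burst were a response to hearing the transition, it should lag by the auditory delay.
expected_if_auditory_response_ms = AUDITORY_DELAY_MS

singing_mismatch_ms = timing_mismatch(OBSERVED_OFFSET_MS, expected_if_motor_cause_ms)
sleep_mismatch_ms = timing_mismatch(OBSERVED_OFFSET_MS, expected_if_auditory_response_ms)
```

Under these assumptions the bursts during singing arrive about 20 milliseconds too late to drive the transition, and the bursts during sleep arrive about 15 milliseconds too early to be a response to it.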

Figure 2: The singer under investigation3.


The zebra finch Taeniopygia guttata.

Although we cannot yet expect definitive answers to the question of how high-level motor representations determine the control signals for song production, the syringeal-modelling approach pursued by Amador et al. provides both a method for breaking the song down into its basic units and evidence that HVC bursts are related to specific events in a bird's song. With a better understanding of the basic units, these results provide a foundation for understanding how birds learn to string these pieces back together to produce a whole song.


*This article and the paper under discussion3 were published online on 27 February 2013.


  1. Cynx, J. J. Comp. Psychol. 104, 3–10 (1990).

  2. Hahnloser, R. H. R., Kozhevnikov, A. A. & Fee, M. S. Nature 419, 65–70 (2002).

  3. Amador, A., Perl, Y. S., Mindlin, G. B. & Margoliash, D. Nature 495, 59–64 (2013).

  4. Yu, A. C. & Margoliash, D. Science 273, 1871–1875 (1996).

  5. Dave, A. S. & Margoliash, D. Science 290, 812–816 (2000).

  6. Mindlin, G. B. & Laje, R. The Physics of Birdsong (Springer, 2005).

  7. Laje, R., Gardner, T. J. & Mindlin, G. B. Phys. Rev. E 65, 051921 (2002).

  8. Fee, M. S., Kozhevnikov, A. A. & Hahnloser, R. H. Ann. NY Acad. Sci. 1016, 153–170 (2004).

Author information



Corresponding author

Correspondence to Todd W. Troyer.

About this article

Cite this article

Troyer, T. The units of a song. Nature 495, 56–57 (2013).
