The art of musical performance lies largely in nuance — in making notes longer or shorter than they are written, or in shaping their dynamics, articulation or pitch. Performers don't generally have explicit theories of these things, as it's all done by ear. But these often subtle changes to the written score are responsible for a great deal of what makes music memorable, moving and meaningful — and they can be measured.

This combination of cultural meaning and measurability makes musical performance a productive example of the relationship between science and the humanities. When musicology came into being in the nineteenth century, it was modelled on philology, the study of ancient texts. For this reason, musicologists have tended to think of music as a form of writing. But much of what performers do, and what listeners respond to, falls between the notes as musicologists construe them. This is where science comes in.

Measurements cannot capture cultural values, but people listening to music respond to specific sounds. These sounds are amenable to scientific study, providing insights into the cultural values they embody. I focus on classical piano performance, but my claim is more general: to understand music as performance, we must use scientific and humanities approaches in tandem.

Stick to the plot

Two of the most important aspects of musical performance are the shaping of the tempo and the dynamics. Tempo shaping is the lengthening or shortening of notes or phrases and is measured by extracting beat durations from sound. Dynamics shaping is the patterning of loud and soft notes, to create one-off accents or waves of increase and decrease. It can be extracted as a continuously varying value or as a series of discrete values associated with individual notes.
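To make the idea of measuring tempo from beat durations concrete, here is a minimal sketch. It assumes that beat onset times have already been extracted from a recording by some other means; the function name and the example onsets are hypothetical, chosen only for illustration.

```python
def beat_tempi(onsets_s):
    """Convert beat onset times (in seconds) to a tempo in beats per minute for each beat."""
    if len(onsets_s) < 2:
        raise ValueError("need at least two beat onsets")
    # The duration of each beat is the gap between consecutive onsets.
    durations = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    if any(d <= 0 for d in durations):
        raise ValueError("onsets must be strictly increasing")
    return [60.0 / d for d in durations]


# Hypothetical onsets: two even beats, then one lengthened beat.
print(beat_tempi([0.0, 0.5, 1.0, 1.75]))  # [120.0, 120.0, 80.0]
```

A lengthened beat shows up as a dip in the tempo curve, which is the kind of nuance a line graph of tempo records.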

Musicologists and psychologists have generally focused on how such data relate to the structure of music, drawing on traditional, notation-based analytical methods. For instance, by a kind of reverse engineering from the performance back to the composition, they have shown how performers use various combinations of speed change and dynamic accents to underline structural breaks or bring out important points.

Line graphs of tempo and dynamics are hard to relate to the music we hear, but over the past two years software has been developed that incorporates these graphs within a music visualization program so that they scroll past a cursor as one listens. Other limitations to this type of approach are less tractable. If performance is analysed in terms of the score-based structure, one is deaf to aspects of the performance that have nothing to do with what is written down. In effect, this assumes that the point of performance is to reproduce a meaning that is already there on the printed page, but any jazz or pop performance demonstrates what an inadequate approach that is.

There is a further, more subtle, problem. Try dancing to a Chopin mazurka and it soon becomes clear that concert evocations of dance music have much more extravagant shaping than music that is for actual dancing. A tempo graph would show this, and so says something about the music one experiences. Its shape, however, is the result of several distinct factors. To understand what is going on, we need to break the data down into their component parts. The question is what those parts might be.

Musical movement

In the early 1990s, Henkjan Honing and Peter Desain suggested that the shaping of both tempo and dynamics in classical music performance can be explained in terms of three main components: note-to-note shaping, the composer's 'pulse', and hierarchical phrase arching. The third of these refers to the way performers get faster and louder as they play into a phrase, and slower and quieter as they come out of it, giving the music a kind of breathing quality. It is often seen in nineteenth-century piano music, such as Chopin's. It is hierarchical in that such patterns can be found at multiple levels, such as 2, 4, 8 and even 16 bars. It is widely seen as part of what it means to play 'musically', that is to say expressively and meaningfully.
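Hierarchical phrase arching can be illustrated with a toy model. The sketch below is not the published model: the parabolic arch shape, the equal weighting of levels and the `depth` parameter are all assumptions made for illustration. It sums one arch per phrase level to give a tempo multiplier for each bar; the same curve could equally be read as a loudness multiplier.

```python
def phrase_arch(n_bars, levels=(2, 4, 8), depth=0.1):
    """Return one tempo multiplier per bar, summing an arch at each phrase level.

    The parabolic shape and equal weighting are illustrative assumptions.
    """
    if n_bars <= 0 or not levels:
        raise ValueError("need at least one bar and one phrase level")
    curve = []
    for bar in range(n_bars):
        total = 0.0
        for length in levels:
            # Position of the bar's midpoint within its phrase, between 0 and 1.
            x = (bar % length + 0.5) / length
            # Parabola: low at the phrase edges, 1 at the phrase midpoint.
            total += 4.0 * x * (1.0 - x)
        curve.append(1.0 + depth * total / len(levels))
    return curve


curve = phrase_arch(8)
```

In this sketch the bars in the middle of the eight-bar phrase receive larger multipliers than those at its edges, with smaller arches for the two-bar and four-bar units superimposed.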

Musicologists tend to be suspicious of such generalizations. What is considered 'musical' has varied throughout history, as have practices of performance. My team at the AHRC Research Centre for the History and Analysis of Recorded Music (CHARM) in London recently analysed phrase arching in recordings of Chopin's Mazurka Op. 63 No. 3 going back to 1923. We measured how much shaping of phrases occurred through tempo and dynamics, and how far these variables were correlated. We found that, for this piece at least, both tempo and dynamic phrasing were present in the earliest recordings, but that they began to be closely coordinated with one another and with the composed phrasing only after the Second World War. Different performers achieved this coordination in various ways, but the effect was a streamlined style that was still expressive, albeit less personal and subjective than pre-war interpretations.
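One simple way to ask how far two series such as tempo and dynamics move together is a Pearson correlation. The sketch below is an assumption about how such a comparison might be made, not a description of the CHARM team's actual method; the function name and inputs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length series, e.g. per-bar tempo and loudness."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least two")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 would indicate that a performer speeds up and gets louder together; a value near 0 would indicate that the two kinds of shaping are largely independent.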

This finding shows how the nature of what is regarded as musicality has changed. What was assumed to be a general, perhaps hard-wired, quality turns out to be specific to a given time and place. Indeed, the very idea of 'expressive' performance, defined in terms of nuance, assumes that the purpose of music is to convey subjective feeling — an idea foreign to Japanese taiko drumming, for example. That is why these studies concentrate almost exclusively on Western classical music.

Mechanical musicians

It should be possible to apply a fully functioning model of expressive performance to a digital score that computers and synthesizers can read, in which every crotchet (US quarter note) is the same length. This would result in an expressive and meaningful sound output that is mechanically generated but sounds human, reducing creativity to a set of rules.

Some programs do this on the basis of note-to-note shaping and composer's 'pulse'. Director Musices is a free, research-oriented program that uses a set of rules, for example that longer notes are louder than shorter ones, or that a run of ascending notes gets faster; there is also a simple phrase-arching function. These rules can be switched on or off, applied more or less strongly, or even inverted. The commercial program SuperConductor is based on the idea that for every major composer there is a characteristic signature (or pulse) for each beat in the bar, and allows you to 'sculpt' a file into an expressive performance. A new program currently in beta testing, Silbert MOR Expressive Performance, automatically generates human-like performances and is targeted at professionals such as the producers of TV commercials who want music without the trouble and expense of paying musicians or licensing recordings.
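The idea of a rule that can be switched off, scaled or inverted can be sketched in a few lines. This is not Director Musices' actual code or formula: the `Note` type, the `strength` parameter and the scaling constant are hypothetical, chosen only to show how a single signed quantity can cover all three cases.

```python
from dataclasses import dataclass


@dataclass
class Note:
    duration: float  # nominal duration in beats
    loudness: float  # arbitrary loudness units


def longer_is_louder(notes, strength=1.0, scale=10.0):
    """Raise the loudness of longer notes and lower that of shorter ones.

    strength = 0 switches the rule off; a negative strength inverts it.
    The scale constant is an arbitrary illustrative choice.
    """
    if not notes:
        return []
    mean = sum(n.duration for n in notes) / len(notes)
    return [
        Note(n.duration, n.loudness + strength * scale * (n.duration - mean))
        for n in notes
    ]


flat = [Note(1.0, 64.0), Note(2.0, 64.0)]
shaped = longer_is_louder(flat, strength=1.0)
```

Starting from a mechanically even input, the longer note comes out louder than the shorter one; with `strength=-1.0` the relationship is reversed.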

Such tools are a far cry from the 'humanize' functions of sequencing programs, which merely introduce random variation. And if, as the CHARM research suggests, performance styles can be modelled to specific times or places, variable settings could enable one to reproduce the style of particular pianists, effectively generating new recordings of pieces they never played. You might even fantasize about music being mixed in the same way as paint. Instead of buying recordings off the peg, like standard paint ranges, you could customize them: 50% Vladimir Horowitz, 45% Arturo Michelangeli and 5% Jean-Marc Luisada, say.

But a musical performance isn't a pot of paint. It is a human action carried out at a certain time and place, normally in the presence of others and marked by the contingencies of the occasion. The same applies to recordings, even when they owe more to studio manipulation than real-time performance. We still hear them as traces of events. Remove the communication from music and it rapidly becomes as pointless as it would be to spend time in the virtual world Second Life if there were no real people behind the avatars and speech bubbles.

A search for meaning

Performance, then, is more than the communication of structural information about musical works. The very act of performance generates meaning, whether the musician is Madonna, Miles Davis or Glenn Gould.

In a 2002 concert performance of Mazurka Op. 63 No. 3 filmed at the Théâtre des Champs-Elysées in Paris, Russian pianist Grigory Sokolov performs virtuosity as much as he performs Chopin: his hands often fly up after a particularly telling note, providing an idiosyncratic balletic correlate to the sound. His performance makes perfect sense on CD, but seeing it adds further meaning. The striking quality of public display in his playing is redolent of the cavernous spaces of modern concert halls and the star quality of the international virtuoso. He enacts exceptionality.

Such evocative flourishes communicate cultural values that cannot be measured. Sokolov, like many Russian pianists, uses particularly strong phrase arching. His expressiveness is structurally generated, rather than primarily located at the note-to-note level as with pre-war pianists, so he is free to indulge in extravagant choreography without losing the musical thread.

Quantitative analysis reveals how phrase arching facilitates Sokolov's virtuosity. Without a systematic approach we would have much less idea of how these effects are created, and it would be hard to quantify how Sokolov's style relates to that of other pianists.

Programs such as Director Musices or the CHARM model of phrase arching can be used to capture the general qualities of performance. The mark of their success is the extent to which they account for the variance in performance data. Such applications can also be used to study a particular performance, such as Sokolov playing Op. 63 No. 3. Here the interest lies in the pattern of discrepancies between the model and the performance. The focus is on the unique features, and the criterion of success is how far the model guides the ear towards an awareness of these qualities, resulting in a process of engaged listening and critical interpretation. Used thus, deterministic models of performance expression do not undermine values of human creativity, but locate them more accurately.
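Both uses of a model can be expressed with the same arithmetic. The sketch below assumes a measured series (for example, tempo per bar) and a model's prediction of it; the function name and the numbers are hypothetical. It returns the proportion of variance the model accounts for, together with the discrepancies that remain.

```python
def fit_quality(observed, predicted):
    """Return (proportion of variance accounted for, list of discrepancies)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean = sum(observed) / len(observed)
    # Discrepancies between the performance and the model, one per data point.
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed series has no variance to account for")
    return 1.0 - ss_res / ss_tot, residuals


# Hypothetical tempi in beats per minute: the model underestimates the middle bar.
r_squared, residuals = fit_quality([100.0, 120.0, 100.0], [100.0, 110.0, 100.0])
```

For a general model, the first value is the measure of success. For a particular performance, the second is where the listening starts: the largest discrepancies point to the moments that the model does not explain.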

Scientific measurement and cultural approaches to performance can be linked usefully. But this is a marriage of complementary approaches, rather than a convergence towards a unified discipline.

Further reading and listening

Clarke, E. F. in Empirical Musicology (eds Clarke, E. F. & Cook, N.) 77–102 (Oxford University Press, 2004).

Desain, P. & Honing, H. Contemp. Music Rev. 7, 123–138 (1993).

Todd, N. Music Percept. 3, 33–57 (1985).

Sokolov, G. Live in Paris. Naive DR 2108 AV 127 (2003).

Director Musices: www.speech.kth.se/music/performance/download

SuperConductor: senticcycles.org/superconductor/page1.html

Silbert MOR Expressive Performance: www.silpormusic.com/Products/PlugIns.asp

See other essays in the Science & Music series at http://www.nature.com/nature/focus/scienceandmusic.