Q&A: Musical intelligence

Journal name: Nature
Volume: 474
Page: 35
DOI: 10.1038/474035a

Eduardo Reck Miranda is a composer and leading researcher in artificial intelligence in music, based at the University of Plymouth, UK. A revised version of his Sacra Conversazione — five movements for string orchestra, percussion and electronics — will be performed on 9 June in London. He explains what music can tell us about speech, physiology and cognition.

Sacra Conversazione

Queen Elizabeth Hall, London.
9 June 2011 at 7.30 p.m.
On air and online on BBC Radio 3 on 3 July.

In Sacra Conversazione you synthesize 'artificial words' by splicing sounds from different languages. What inspired this?

The composition focuses on the non-semantic communicative power of the human voice, which is conveyed mostly by the melodic contour, rhythm, speed and loudness of vocal sounds. There is evidence that the non-semantic content of speech, such as emotional intent, is processed by the brain faster than semantic content: humans seem to have evolved a 'fast lane' for this non-semantic content. I believe that this aspect of our mind is central to our capacity for making and appreciating music.

How did you create the artificial words?

I started by combining single utterances from several languages. I used more than a dozen — as diverse as Japanese, English, Spanish, Farsi, Thai and Croatian — to form hundreds of composite 'words', as if I were creating the lexicon for a new artificial language. It was a painstaking job, using sophisticated speech-synthesis methods. Yet I was surprised that only about one in five of these new 'words' sounded natural.

Why didn't the assembled words sound realistic?

The problem was in the transitions between the original segments. For example, the transition from, say, Thai utterance A to Japanese utterance B did not sound right. But the transition of the former to Japanese utterance C was acceptable. I have come to believe that the main reason is physiological. When we speak, our vocal mechanism articulates a number of muscles simultaneously. So if we synthesize artificial utterances that are physiologically implausible, the brain is reluctant to accept them. Human-voice perception — and, I suspect, auditory perception in general — is very much influenced by the physiology of vocal production.

What other approaches did you try?

I tried to synthesize voices using a physical model of the vocal tract. The model has more than 20 variables, each of which roughly represents a particular muscle. But I found it extremely difficult to produce decent utterances with this model. All the same, I used some of these sounds in the composition: they sounded voice-like but not word-like. This difficulty explains why artificial speech technology is still so reliant on splicing and smoothing methods.

How did you then turn these sounds into music?

Synthesis and manipulation of voice are only the cogs, nuts and bolts. Music happens when one starts to assemble the machine. It is hard to describe how I composed Sacra Conversazione, but inspiration played a big part. Creative inspiration is beyond the ability of computers, yet finding its origin is the holy grail of the neurosciences. How can the brain draw up and execute plans on our behalf implicitly, without telling us?

L. RUSSELL

Composer Eduardo Reck Miranda (back left, at computer) synthesizes music from human voice sounds.

Is the evolving understanding of music cognition opening up possibilities in music composition?

Yes, to a limited extent. But progress will probably emerge from the reverse: new possibilities in musical composition will contribute to the development of such understanding. Cognitive neuroscience methods force scientists to narrow the concept of music, whereas I want to broaden it. But the approaches are not incompatible: each can inform and complement the other.

What are you working on now?

I am working on a human–computer interface in which the user can control musical parameters solely with the brain (see http://go.nature.com/ieds9g). And I am orchestrating plots of spiking neurons and the behaviour of artificial-life models for Sound to Sea, a large-scale symphonic piece for orchestra, church organ, percussion, choir and mezzo-soprano soloist. The piece will be premiered in 2012.

What do you hope audiences will feel when listening to your work?

My main aim is to compose music that is appreciated as a piece of art rather than as a challenging auditory experiment. If the music makes people think about the relationship between sound and language, I will be even happier. Music is not merely entertainment.
