Published online 24 January 2011 | Nature | doi:10.1038/news.2011.40
Corrected online: 25 January 2011


How words get the message across

Languages are adapted to deliver information efficiently and smoothly.

The length of words is related to how much information they communicate.

Longer words tend to carry more information, according to research by a team of cognitive scientists.

It's a suggestion that might sound intuitively obvious, until you start to think about it. Why, then, the difference in length between 'now' and 'immediately'? For many years, linguists have tended to believe that the length of a word is associated with how often it is used, and that short words are used more frequently than long ones. This association was first proposed in the 1930s by the Harvard linguist George Kingsley Zipf [1].

Zipf believed that the relationship between word length and frequency of use stemmed from an impulse to minimize the time and effort needed for speaking and writing, as it means we use more short words than long ones. But Steven Piantadosi and colleagues at the Massachusetts Institute of Technology in Cambridge say that, to convey a given amount of information, it is more efficient to shorten the least informative — and therefore the most predictable — words, rather than the most frequent ones.

Zipf's original association is roughly correct, as implied by how much more often 'a', 'the' and 'is' are used in English than, say, 'extraordinarily'. And this relationship of length to use seems to hold up in many languages. Because written and spoken length are generally similar, it applies to both speech and text.

But after analysing word use in 11 different European languages, Piantadosi and colleagues found that the length of words was more closely correlated with their information content than with how often they are used. They describe their results in the Proceedings of the National Academy of Sciences [2].

"This is a landmark study", says linguist Roger Levy of the University of California at San Diego. "Our understanding of the relationship between word frequency and length has remained relatively static since Zipf's discoveries," he says, and he feels that this new study may now supply "the largest leap forward in 75 years" in understanding how the evolution of words is governed by the efficiency with which they can be used to communicate.

Method madness

Measuring the information content of a word isn't easy, especially because it can vary depending on the context. But Piantadosi and colleagues make the assumption that the more predictable a word is, the less informative it is. So the word 'nine' in 'A stitch in time saves nine' contains less information than it does in the phrase 'The word that you will hear is nine', because in the first case it is highly predictable: when it comes, it doesn't significantly add to the information already in the phrase.

The MIT group devised a method for estimating the information content of words in digitized texts by looking at how it is correlated with — and thus predictable from — the preceding words. For just a single preceding word, Piantadosi explains, "we count up how often all pairs of words occur together in sequence, such as 'the man', 'the boy', 'a man', 'a tree' and so on. Then we use this count to estimate the probability of a word conditioned on the previous word — or more generally, the probability of any word conditioned on any preceding sequence of a given number of words." According to information theory, the information content is then proportional to the negative logarithm of this probability.
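
The single-preceding-word case that Piantadosi describes can be illustrated with a minimal Python sketch. This is not the MIT group's code: the function names, the toy corpus, the choice of base-2 logarithms (giving bits) and the step of averaging over each word's occurrences are all assumptions made for illustration, and the probabilities are plain unsmoothed counts.

```python
import math
from collections import Counter, defaultdict


def bigram_surprisals(tokens):
    """Return (previous_word, word, bits) for every adjacent pair of tokens.

    Hypothetical helper: probabilities are raw count ratios with no smoothing.
    """
    pair_counts = Counter(zip(tokens, tokens[1:]))   # how often each pair occurs in sequence
    context_counts = Counter(tokens[:-1])            # how often each word appears as the preceding word
    out = []
    for prev, word in zip(tokens, tokens[1:]):
        # P(word | prev) = count(prev, word) / count(prev).
        # Information is -log2 P, written here as log2(1 / P).
        bits = math.log2(context_counts[prev] / pair_counts[(prev, word)])
        out.append((prev, word, bits))
    return out


def mean_information(tokens):
    """Average the information of each word over all the contexts it appears in.

    Averaging per word is an assumption of this sketch, not a detail from the article.
    """
    per_word = defaultdict(list)
    for _, word, bits in bigram_surprisals(tokens):
        per_word[word].append(bits)
    return {word: sum(values) / len(values) for word, values in per_word.items()}


# Toy corpus (an assumption; real estimates need a large digitized text).
corpus = "the man saw the boy and a man saw a tree".split()
for word, bits in sorted(mean_information(corpus).items()):
    print(f"{word:5s} length={len(word)} mean_bits={bits:.2f}")
```

In this toy corpus 'saw' always follows 'man', so it is fully predictable and carries no information, whereas 'tree' follows 'a' only half the time and carries one bit. Comparing such per-word averages with word length across a large text is the kind of analysis the study reports, although with only a handful of words no trend can be read from this example.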


However, physicist Damián Zanette of the Centro Atómico Bariloche in San Carlos de Bariloche, Argentina, who has studied Zipf-type relationships in linguistics, is not persuaded that the MIT group's method accurately captures the real information content of a word in context. This, he says, is typically determined by several hundred surrounding words, not just a few [3].

Piantadosi and colleagues suggest that the relationship of word length to information content might not only make it more efficient to convey information linguistically but also make language cognition a smoother ride for the reader or listener. If shorter words carry less information, then the density of information throughout a phrase or sentence will be smoothed out, so that it is delivered at a roughly steady rate rather than in lumps. In this way, the results suggest how the structure of language might aid communication.

Surprising though it may seem, some linguists, such as Noam Chomsky, have suggested that communication might not be the primary purpose of language, and that it might, for example, be primarily about establishing social relations. Yet according to cognitive scientist Florian Jaeger at the University of Rochester in New York, these new results "suggest that communication is a sufficiently important aspect of language to shape it over time".


An editing error inadvertently identified Piantadosi's research group as being based at Harvard, rather than MIT. The text has been corrected to reflect this.
  • References

    1. Zipf, G. The Psychobiology of Language (Routledge, 1936).
    2. Piantadosi, S. T., Tily, H. & Gibson, E. Proc. Natl Acad. Sci. USA doi:10.1073/pnas.1012551108 (2011).
    3. Montemurro, M. A. & Zanette, D. H. Adv. Complex Syst. 13, 135-153 (2010).

