FIGURE 3. Distributions of the number of occurrences of Pfam protein domains (blue squares) in the genome of the yeast Saccharomyces cerevisiae, and of words (red diamonds) in Shakespeare's Romeo and Juliet, in both cases sorted in rank order from left to right.

From the following article:

The language of genes

David B. Searls

Nature 420, 211-217 (14 November 2002)

doi:10.1038/nature01255

The most frequently occurring domains and words are labelled. In both cases (and in many other genomes and texts) the curves are good fits to a power-law distribution known as Zipf's law, which relates the frequency to the inverse of the rank.
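The relationship described in the caption can be written as f(r) ≈ C / r^s, where f is the number of occurrences, r is the rank, and s is close to 1 for a classic Zipf distribution. The sketch below is not taken from the article. It is a minimal illustration, assuming the input is simply a list of tokens (words or domain identifiers), of how a rank-ordered frequency list could be built and how the exponent s might be estimated by a least-squares line through the log-log points. The function names `rank_frequency` and `zipf_exponent` are hypothetical, and a straight-line fit in log-log space is only one rough way to estimate a power-law exponent.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return occurrence counts sorted from most to least frequent.

    The position in the returned list (starting at 1) is the rank.
    """
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate s in f(r) ~ C / r**s by ordinary least squares on
    (log rank, log frequency). Assumes at least two ranks and
    strictly positive frequencies.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranked frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The fitted slope is negative for a decaying curve; s is its magnitude.
    return -slope


if __name__ == "__main__":
    # Illustrative toy input, not data from the figure.
    words = "a b a c a b".split()
    print(rank_frequency(words))

    # Synthetic frequencies that follow f(r) = 120 / r exactly.
    ideal = [120, 60, 40, 30, 24]
    print(zipf_exponent(ideal))
```

For frequencies that follow C / r exactly, as in the synthetic `ideal` list, the estimated exponent comes out at approximately 1. Real word or domain counts would be expected to deviate from the line, particularly at the highest and lowest ranks.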
