Segmental duplications are a common feature of the human genome and they can be very difficult to sequence. But they should not be ignored. Through careful analysis of one set of segmental duplications, Matthew Johnson and colleagues have discovered a new human gene family. Furthermore, this gene family seems to have been subjected to powerful positive selection.

The authors focused on a 20-kb repeated segment that is confined to a 15-Mb region of chromosome 16. There are 15 copies of this segment, all of which have very high levels of sequence similarity. To study the evolutionary history of the duplicated segments, the authors identified the orthologous sequences in a series of primates. Their analysis showed that the segment is present in only one or two copies in Old World monkeys, such as the baboon. By contrast, in great apes such as gorillas, which are more closely related to humans, there are 9–30 copies of the segment. Overall, the segment seems to have been duplicated recently and independently in several primate lineages, after the divergence of humans and great apes from Old World monkeys.

When the sequence of the human segment was used in a database search, the authors found that an expressed sequence lurks within the repeated segment, although no homologues could be found in other organisms. Surprisingly, sequence comparisons of the human repeats showed that the putative protein-coding regions are five times more divergent than the non-coding regions of the repeat. This indicates that the gene might be under positive selection for adaptive mutations. In support of this, the ratio of nonsynonymous (amino-acid-changing) to synonymous (silent) substitutions was significantly greater than 1, and on the basis of comparisons with the primate sequences, evidence of positive selection could be found during the divergence of the great ape and human lineages. Despite the relatively recent duplication events, the gene has undergone major evolutionary change.
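To make the ratio test concrete, the sketch below estimates the proportions of nonsynonymous and synonymous differences (pN and pS) between two aligned coding sequences, in the spirit of the Nei-Gojobori counting method. It is an illustration only: the article does not say which method the authors used, the function names and toy sequences are hypothetical, and the sketch simplifies by treating changes to stop codons as nonsynonymous and by omitting any correction for multiple substitutions at the same site.

```python
from itertools import permutations, product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3).

    Each of the nine possible single-base changes contributes 1/3 of a
    site if it leaves the encoded amino acid unchanged.
    """
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                sites += 1 / 3
    return sites


def count_differences(c1, c2):
    """Return (synonymous, nonsynonymous) differences between two codons.

    When the codons differ at more than one position, the counts are
    averaged over every order in which the changes could have occurred.
    """
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0, 0.0
    paths = list(permutations(diff_pos))
    syn = non = 0.0
    for path in paths:
        current = c1
        for pos in path:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODON_TABLE[current] == CODON_TABLE[nxt]:
                syn += 1
            else:
                non += 1
            current = nxt
    return syn / len(paths), non / len(paths)


def pn_ps(seq1, seq2):
    """Return (pN, pS) for two aligned, gap-free coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    syn_sites = syn_diff = non_diff = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        # Average the site counts of the two codons being compared.
        syn_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        s, n = count_differences(c1, c2)
        syn_diff += s
        non_diff += n
    non_sites = len(seq1) - syn_sites
    return non_diff / non_sites, syn_diff / syn_sites


# Toy example (invented sequences): one silent and one amino-acid-changing difference.
p_n, p_s = pn_ps("GGGAAA", "GGAGAA")
ratio = p_n / p_s
```

A ratio well below 1, as in this toy pair, is what purifying selection typically produces; a ratio significantly above 1, as reported for this gene family, is the signature usually read as positive selection. Real analyses generally also correct for multiple substitutions and test significance with dedicated software.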

Positive selection for amino-acid substitutions indicates an important function, which, for this gene family, is as yet unknown. Nevertheless, these observations attest to the importance of duplication and divergence as key evolutionary mechanisms. So, although segmental duplications can be a positive nuisance for genome sequencers, they provide some fascinating clues about the history of our genome.