As the technology for gene sequencing becomes more powerful, the prospect of personalized medicine draws ever closer. On 4 September, genomics pioneer Craig Venter followed in the footsteps of James Watson and published his complete genome sequence (see page 6). Plans are already being laid for hundreds more personal genomes to be sequenced (see Nature 447, 358–359; 2007; doi:10.1038/447358a).

But a publication that will attract rather less attention than these personal genomes illustrates why researchers should ensure that genomic data are collected not just from humans and their closest relatives, but from every far-flung branch of the tree of life.

The paper in question focuses on segments of 'ultraconserved' DNA — sections that have stayed exactly the same throughout recent vertebrate evolution, and are identical in humans, rats and mice (see page 10). The available evidence suggests that this extreme example of DNA conservation is no accident: the sequence stays because there is a strong selective force weeding out mutations in it. In other words, it is likely to be important to its host.

Yet when researchers based at Lawrence Berkeley National Laboratory in California removed four pieces of ultraconserved DNA from different mice, the deletions had no apparent effect on the rodents (N. Ahituv et al. PLoS Biol. 5, e234; 2007). This counterintuitive result contradicts predictions based on genetic conservation and the shaping of our genomes during evolution. Reconciling it with what scientists currently know would be easier if geneticists could figure out where ultraconserved DNA comes from, or what its function might be. Researchers have suggested that it could be involved in splicing RNA transcripts (J. Z. Ni et al. Genes Dev. 21, 708–718; 2007) or in enhancing transcription (L. A. Pennacchio et al. Nature 444, 499–502; 2006).

But so far there has been just one report on the origin of some ultraconserved DNA (G. Bejerano et al. Nature 441, 87–90; 2006). A team from the University of California, Santa Cruz, traced the origins of one ultraconserved region back to a group of ancient fishes, including the coelacanth. This was only possible because another group of researchers had previously opted to sequence a few segments of coelacanth DNA. As Gill Bejerano, a former member of the Santa Cruz team, says: “If the coelacanth people hadn't been interested in that puny 1% of the genome we would not have the answer. Who knows what other information is out there?”

It is clear that efforts to understand the mechanisms of evolution will benefit from getting as much genetic information on as many diverse organisms as possible. As it happens, the US National Human Genome Research Institute (NHGRI) announced in May that it would add a fly and a worm to its ENCODE project, which aims to catalogue all the functional parts of the human genome. The additions are intended to help the project meet its human goals (see Nature 447, 361; 2007).

The NHGRI and the other main public backer of genomics research in the United States, the Department of Energy, are each committed to comparative genomics. But under the influence of the 'roadmap' of the US National Institutes of Health (NIH), which emphasizes the translation of research findings into the clinic, the NHGRI is moving more forcefully into purely human genomics. The biggest new projects recently announced by the NHGRI are all human-centric, such as the Cancer Genome Atlas and a pair of initiatives to hunt for the genetic causes of human disease — the Genetic Association Information Network and the Genes, Environment and Health Initiative. The institute would also like to embark on a major human cohort study (F. S. Collins and T. A. Manolio Nature 445, 259; 2007), and it is setting up a strategy for sequencing the genomes of our closest relatives, the non-human primates.

This is all understandable enough: the public is entitled to expect that the results of NIH research will be useful, when possible, to public health. But a better understanding of DNA function and the consequences of mutation will come only from generating more data from diverse genomes, backed by the bioinformatics capability that is needed to annotate them. That way scientists can learn more about the extent of DNA conservation throughout the living world and, ultimately, tease out a deeper comprehension of the human genome.