Sequencing project reveals microbial cache of protein families.
Sequencing neglected microbes could accelerate the discovery of new protein families and biological traits, a study published today suggests.
Genome sequencers have tended to gravitate towards microbes with special traits, such as deadly pathogens or deep-sea extremophiles. As a result, sequencing efforts have been piecemeal and have left large portions of the microbial tree of life blank. In this week's issue of Nature, Jonathan Eisen at the University of California, Davis, and his co-authors analyse the complete sequences of 56 bacterial and archaeal species (ref. 1) that were selected to help fill those gaps.
The approach nets an average of 1,000 protein families for each genome sequenced, and so far the researchers have identified 1,768 protein families that seem to be new to science. Discovering novel enzymes that can be used in industrial processes such as bioremediation or biofuel production is one of the main practical goals of microbial genomics.
Donald Bryant, a microbial physiologist at Pennsylvania State University in University Park who was not involved in the research, says that this evolutionarily guided strategy "has the potential to direct people toward new discoveries".
The study represents the first publication of the Genomic Encyclopedia of Bacteria and Archaea, launched in 2007 by the Joint Genome Institute (JGI) in Walnut Creek, California, where Eisen holds an adjunct appointment. The project developed after he completed eight microbial genome sequences and realized — with some disappointment — that the deepest branches of the microbial tree of life were still murky. JGI director Eddy Rubin suggested starting up a large-scale sequencing project to rectify the situation, but only if Eisen and his collaborators could demonstrate the practical benefits of this basic research.
Now Eisen believes he has made his case. The team examined a microbial family tree that had been assembled on the basis of the gene encoding the RNA that forms the small subunit of the bacterial ribosome, and chose 200 lineages whose members had never been sequenced. Collaborators at the German Collection of Microorganisms and Cell Cultures in Braunschweig then grew enough of the microbes to begin the sequencing; the team prioritized the ones from which it would be easiest to obtain sufficient quantities of DNA. Compared with a random selection of microbes, Eisen says, evolutionarily guided sequencing can net more proteins and improve the reconstruction of the tree of life.
One species, called Haliangium ochraceum, which lives in coastal saltwater environments, has a gene similar to those that make actin proteins in eukaryotes — organisms whose cells contain complex membrane-bound structures, such as fungi, plants and animals. This is the first time that a version of this 'textbook' eukaryotic gene has been discovered in bacteria. Eisen suspects that the bacterium stole the gene from a eukaryote, and may use it to disrupt the growth of eukaryotic prey or competitors such as fungi. "This is just an emblematic reason to do unbiased sampling," Eisen says. "Nobody would have sequenced the genome of this organism other than for phylogenetic reasons."
He believes that sequencing just 1,520 microbial strains selected to fill in the microbial tree of life could encompass half of the diversity of the bacteria and archaea that can be cultured in a laboratory. Ultimately, Eisen hopes that this study will convince funding agencies to support the sequencing of not only those organisms, but thousands more that have never been cultured.
But not everyone is enamoured of the strategy. "We all agree that many more microbial genomes are needed, but it does not really matter in what order these will be sequenced," says Stephan Schuster of Pennsylvania State University. For Schuster, the most important thing is that genomes are finished and annotated rather than being reported as "laundry lists of novel features".
1. Wu, D. et al. Nature 462, 1056-1060 (2009).