News and Views

Nature 407, 466-467 (28 September 2000) | doi:10.1038/35035195

Genomics: Use your neighbour's genes

Don Cowan

On page 508 of this issue, Ruepp and colleagues1 describe the complete genome sequence of the acid- and heat-loving microorganism Thermoplasma acidophilum. This hardy organism, which lacks a cell wall, grows best on organic substrates at pH 2 and 59 °C. It was first isolated, in the late 1960s, from a self-heating ore pile2; such ore piles generate heat through internal microbial activity.

Microbial physiologists and structural biologists have long been fascinated by the ability of this microorganism to grow at high temperatures and low pH without the structural protection of a conventional cell wall. T. acidophilum is also interesting from an evolutionary perspective. Its cellular morphology seems primitive, and it contains complexes involved in protein folding, degradation and turnover that look like simple versions of related structures in eukaryotic cells (loosely, those cells with a nucleus — the type of cell that makes up higher organisms such as you and me). These facts intrigue evolutionary biologists, who have speculated that T. acidophilum is an ancestor of the eukaryotic cell.

Initially, T. acidophilum was classified as a thermophilic mycoplasm, a heat-loving example of a group of primitive, gliding bacteria that lack cell walls2. But following analysis of its lipid composition and ribosomal RNA sequences, it was reassigned to the new 'third domain of life', the Archaea3 (Fig. 1). T. acidophilum is the ninth member of the Archaea for which the genome has been completely sequenced4, 5, 6, 7, 8, 9, 10. All except one of these microorganisms are heat-loving. Why is there this focus on the thermophilic Archaea? The explanation is threefold and lies in the intense interest in the biochemistry of these unusual organisms, in the possibility that they represent the earliest forms of life, and in the biotechnological potential of their genes and gene products.

Figure 1: The three domains of life, showing how Thermoplasma acidophilum — the latest archaeon whose genome has been sequenced1 — fits into the evolutionary scheme of things.

On the basis of its primitive morphology and the presence of what seem to be several primitive cellular structures, T. acidophilum was once thought to be an ancestor of eukaryotic cells. But details of its genome sequence1 make this unlikely. This phylogenetic tree is based on 16S ribosomal RNA sequences (modified from ref. 12).


Thermoplasma acidophilum has one of the smallest of the archaeal genomes to have been sequenced so far. Even so, the speed at which Ruepp et al.1 sequenced the genome was remarkable. The full sequence of over 1.5 million base pairs was obtained from only 7,855 sequencing reactions — an effective yield of 199 base pairs per reaction, compared with the 66 base pairs per reaction for the slightly larger genome of the hyperthermophilic bacterium Thermotoga maritima11. The keys to this efficiency were the use of sequencing vectors containing very large DNA inserts; an extended-sequencing method referred to as 'primer walking'; and a policy of stopping the sequencing of an insert when the primer walking encountered a stretch of DNA whose sequence was already known.
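The yield figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses just the numbers given in the text (7,855 reactions; 199 and 66 base pairs per reaction), and the function and variable names are hypothetical. Multiplying the effective yield by the number of reactions gives the genome length implied by those figures, which should come out a little over 1.5 million base pairs.

```python
# Figures quoted in the text; all names are illustrative, not from Ruepp et al.
REACTIONS_T_ACIDOPHILUM = 7_855   # sequencing reactions
YIELD_T_ACIDOPHILUM = 199         # effective base pairs per reaction
YIELD_T_MARITIMA = 66             # effective base pairs per reaction


def implied_genome_length(reactions: int, bp_per_reaction: float) -> float:
    """Genome length implied by an effective yield (base pairs per reaction)."""
    return reactions * bp_per_reaction


def reactions_needed(genome_bp: float, bp_per_reaction: float) -> float:
    """Reactions needed to cover a genome at a given effective yield."""
    return genome_bp / bp_per_reaction


implied_bp = implied_genome_length(REACTIONS_T_ACIDOPHILUM, YIELD_T_ACIDOPHILUM)
yield_ratio = YIELD_T_ACIDOPHILUM / YIELD_T_MARITIMA
# Hypothetical comparison: the same genome sequenced at the lower yield.
reactions_at_lower_yield = reactions_needed(implied_bp, YIELD_T_MARITIMA)

print(f"Implied genome length: {implied_bp:,.0f} bp")
print(f"Yield ratio: {yield_ratio:.1f}x")
print(f"Reactions needed at 66 bp/reaction: {reactions_at_lower_yield:,.0f}")
```

On these figures the effective yield is roughly three times that quoted for T. maritima, so the same genome would have needed about three times as many reactions at the lower yield. The quoted yields are rounded, so the implied length is approximate.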

So, what can we learn from this genome? The most startling observation is the high proportion of genes that seem to have been acquired from other species. For example, 17% of all identified 'open reading frames' (the parts of genes that encode proteins) have relatives in the not-yet-completely sequenced genome of the archaeon Sulfolobus solfataricus.

There may be several reasons why T. acidophilum has such an extraordinary ability to acquire external genes. First, environmental proximity is clearly important. Microorganisms of the genus Sulfolobus might be the most common archaeal species in the habitat occupied by T. acidophilum. In addition, a further 17% of the open reading frames of T. acidophilum are 'bacteria-like'. So it might also have acquired some of its genes from bacteria such as Alicyclobacillus, Thiobacillus or Sulfobacillus, the habitats of which overlap with that of Thermoplasma. Second, the absence of a conventional, protective cell wall could be particularly significant: a cell wall is a major barrier to the entry of large molecules into a cell. Finally, the T. acidophilum genome might not be protected by a restriction/modification system, a set of enzymes designed to recognize and destroy foreign DNA. The organism has no 'restriction endonuclease' activity. The picture is not clear-cut, however: its genome might encode a DNA methyltransferase, which is normally part of a restriction/modification system, and restriction endonuclease genes have so little sequence similarity to one another that they cannot be recognized from a gene sequence alone.

Interestingly, the Sulfolobus-like genes in the T. acidophilum genome are clustered into several (at least five) discrete regions. Ruepp et al. conclude that only a few gene-transfer events occurred, each involving movements of large chunks of genetic sequence. But the transfer of smaller gene fragments between species tends to be more common, raising the question of why this seems not to have happened for T. acidophilum.

One issue can be almost settled by the details of this new genome sequence: whether T. acidophilum is an ancestor of eukaryotic cells. Ruepp et al. compared T. acidophilum genes with those in bacterial and eukaryotic databases. The results show that, if anything, the T. acidophilum genes are more similar to bacterial genes than to eukaryotic ones. Key 'marker' genes found in eukaryotes (such as genes encoding subunits of the nuclear pore complex) are not found in the T. acidophilum genome.

Finally, on a different note, the completion of another genome sequence reminds us how much we still do not know about gene function as a whole. Of the predicted 1,509 open reading frames in the T. acidophilum genome, 29% are akin only to 'hypothetical' open reading frames in other genomes, and 16% have no relatives elsewhere. This means that, as yet, we do not know what 45% of the protein-coding regions in the T. acidophilum genome do. That is a lot of genes. These percentages are typical for newly sequenced genomes. But the results serve as a reminder of the need both for more advanced data-mining techniques (which would increase our ability to pick out similar sequences from different genomes and to identify putative functions) and for the continuation of more classical molecular and functional research.
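To put the percentages above into gene counts, here is a minimal sketch that uses only the figures in the text (1,509 predicted open reading frames; 29% and 16%). The names are hypothetical, and because the published percentages are rounded, the resulting counts are approximate.

```python
# Figures quoted in the text; names are illustrative.
TOTAL_ORFS = 1_509            # predicted open reading frames
PCT_HYPOTHETICAL_ONLY = 29    # akin only to 'hypothetical' ORFs elsewhere
PCT_NO_RELATIVES = 16         # no relatives in other genomes


def approximate_count(total: int, percent: int) -> int:
    """Approximate number of ORFs in a category, given a rounded percentage."""
    return round(total * percent / 100)


pct_unknown = PCT_HYPOTHETICAL_ONLY + PCT_NO_RELATIVES
hypothetical_only = approximate_count(TOTAL_ORFS, PCT_HYPOTHETICAL_ONLY)
no_relatives = approximate_count(TOTAL_ORFS, PCT_NO_RELATIVES)
unknown_function = approximate_count(TOTAL_ORFS, pct_unknown)

print(f"Hypothetical-only ORFs: ~{hypothetical_only}")
print(f"ORFs with no relatives: ~{no_relatives}")
print(f"ORFs of unknown function: ~{unknown_function} ({pct_unknown}%)")
```

On these figures, something like 680 of the predicted open reading frames have no assigned function.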



  1. Ruepp, A. et al. Nature 407, 508–513 (2000).
  2. Darland, G. et al. Science 170, 1416–1418 (1970).
  3. Woese, C. R. & Fox, G. E. Proc. Natl Acad. Sci. USA 74, 5088–5090 (1977).
  4. Kawarabayasi, Y. et al. DNA Res. 6, 83–101 (1999).
  5. Klenk, H. P. et al. Nature 390, 364–370 (1997).
  6. Smith, D. R. et al. J. Bacteriol. 179, 7135–7155 (1997).
  7. Bult, C. J. et al. Science 273, 1058–1073 (1996).
  8. Maeder, D. L. et al. Genetics 152, 1299–1305 (1999).
  9. Kawarabayasi, Y. et al. DNA Res. 5, 55–76 (1998).
  10. Kawashima, T. et al. Proc. Jpn Acad. B 75, 213–218 (1999).
  11. Nelson, K. E. et al. Nature 399, 323–329 (1999).
  12. Pace, N. R. Science 276, 734–739 (1997).