Main

Mycoplasma gallisepticum is the sixth and latest mycoplasma species to be fully sequenced (ref. 1), the previous five being Mycoplasma genitalium (0.58-Mb genome; human host), Mycoplasma pneumoniae (0.82-Mb genome; human host), Ureaplasma urealyticum (0.75-Mb genome; human host), Mycoplasma pulmonis (0.96-Mb genome; rodent hosts) and Mycoplasma penetrans (1.35-Mb genome; human host). Mycoplasmas are parasitic intracellular pathogens that are thought to have descended from a Gram-positive ancestor with a low G+C content, and to have undergone a process of reductive evolution in their new and isolated niches. M. gallisepticum is a heavyweight in this group, with a genome of 0.99 Mb that encodes a predicted 742 coding sequences (CDSs). Comparative analysis of the mycoplasmas indicates that the minimal gene set comprises 265–350 essential genes.

Responsible for avian chronic respiratory infections, M. gallisepticum is an economically important pathogen that is spread by aerosols and by vertical transmission. Over 10% of the CDSs in the genome encode products that belong to a paralogous multi-gene family, distributed among five loci, that encodes 43 variable lipoproteins. Variations in the sequences of the lipoprotein genes are thought to produce enough antigenic variation to facilitate immune evasion. Only one member of this family is usually expressed at any one time, and there is evidence of phase variation. The proteins encoded by many of these lipoprotein genes are predicted to have lectin-binding domains, which might have a role in adherence to host cells.

There are many other membrane-associated proteins in M. gallisepticum, including several predicted transport proteins and a pared-down set of the Sec translocation proteins that is still thought to be functional. The twelve transposases that were identified seem to be responsible for continued gene rearrangement and loss. Of the 742 predicted CDSs, 36% are in the 'unique' or 'conserved hypothetical' protein category, highlighting our lack of understanding of even these simple microorganisms.

In a similar Lilliputian vein, the genome of Nanoarchaeum equitans, at 0.49 Mb, is even smaller than that of M. genitalium; it is not only the smallest archaeal genome sequenced, but the smallest microbial genome sequenced so far (ref. 2). N. equitans is a newly discovered hyperthermophilic archaeon that grows only in co-culture with another archaeon, Ignicoccus sp. Ribosomal protein and rRNA-based phylogenies have placed the branching point of N. equitans evolution early in the archaeal lineage, before that of the Crenarchaeota, Euryarchaeota and Korarchaeota, so it has been placed in a fourth phylum of the Archaea: the Nanoarchaeota.

In contrast to other prokaryotes with a similar genome size, N. equitans shows little evidence of reductive evolution in its genome. The genome has a high gene density, with 95% of its DNA encoding proteins (522 CDSs) and stable RNA. N. equitans lacks the genes for de novo biosynthesis of amino acids, nucleotides, cofactors and lipids, which is consistent with its parasitic lifestyle. One striking feature of the N. equitans genome is the presence of multiple 'split genes': their products are encoded by single genes in most other microorganisms but, in this organism, by two discrete CDSs, both of which are required for full functionality. The split sites for many of these genes lie between domains that are present in the assembled protein product. The authors speculate that multi-domain proteins might have evolved from the fusion of two or more single-domain proteins, in which case the split genes in N. equitans could offer a view of the ancestral state of genes from early in microbial evolution. This, together with other facets of the N. equitans lifestyle, such as its anaerobic and high-temperature growth, indicates a primitive existence. The authors concluded that N. equitans, unlike the mycoplasmas, has not undergone a process of reductive evolution, but could instead represent a living microbial fossil.
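As a rough illustration of the difference in gene packing, the genome sizes and CDS counts quoted above can be turned into an average number of base pairs per predicted CDS. This is only a back-of-the-envelope sketch: the genome sizes are the rounded values given in the text, the function and variable names are invented for this example, and base pairs per CDS is a crude proxy that is not the same measure as the 95% coding fraction reported for N. equitans.

```python
# Figures as quoted in the text. Genome sizes are rounded to 0.01 Mb,
# so every derived value here is approximate.
GENOMES = {
    "M. gallisepticum": {"size_mb": 0.99, "cds": 742},
    "N. equitans": {"size_mb": 0.49, "cds": 522},
}


def bp_per_cds(size_mb, cds):
    """Average number of base pairs of genome per predicted CDS."""
    return size_mb * 1_000_000 / cds


def cds_per_mb(size_mb, cds):
    """Number of predicted CDSs per megabase of genome."""
    return cds / size_mb


for name, genome in GENOMES.items():
    print(
        f"{name}: {bp_per_cds(**genome):.0f} bp per CDS, "
        f"{cds_per_mb(**genome):.0f} CDSs per Mb"
    )
```

On these rounded figures, N. equitans has fewer base pairs of genome per CDS than M. gallisepticum, in line with the high gene density described above.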

As we begin a new year, the pace of microbial genome sequencing shows no sign of abating. Even though the range of genomes sequenced has broadened in scope to include more exotic and unusual prokaryotes, such as N. equitans and the planctomycete Pirellula, the Gamma purple proteobacteria and the Gram-positive bacteria are still dominant (Fig. 1).

Figure 1: Prokaryotic genomes published in 2003.

Data taken from the Genomes OnLine Database (see Online links).