Main

Recent analysis of the Sargasso Sea and Global Ocean Survey metagenomes has provided evidence of regions of hyper variability in the genomes of SAR11 (Rusch et al., 2007; Wilhelm et al., 2007). This was achieved using fragment recruitment (Wilhelm et al., 2007), whereby environmental sequence data were compared against the genome of ‘Candidatus Pelagibacter ubique’ HTCC1062 and then assembled (Rusch et al., 2007).

We have confirmed a region of hyper variability in an SAR11 genome by screening a metagenomic fosmid library for SAR11-like 16S rRNA gene-containing clones. SAR11-specific 16S rRNA gene primers (Béjà et al., 2000) were used to identify clones that cover the 48-kb hyper-variable region 2 (HVR2) of ‘Cand. P. ubique’ HTCC1062. From a library of 10 000 clones, constructed with surface seawater from the Western Channel Observatory (WCO: http://www.westernchannelobservatory.org.uk/; 50.25°N, 04.212°W), one clone was identified that contained an SAR11 16S rRNA gene sequence (see Supplementary Information for description of methods).

PCR screening of the fosmid library yielded only one positive clone (clone 01–003783: GenBank accession number EU410957; G/C content 43.2%), which contained a 16S rRNA gene with 98.8% sequence identity to that of ‘Cand. P. ubique’ HTCC1062. Phylogenetic analysis of the 16S rRNA gene sequence confirmed that it belongs to the SAR11 clade of bacteria (Supplementary Figure S1). The small number of SAR11 fosmid clones produced is consistent with the findings of others (Suzuki et al., 2004; DeLong et al., 2006) and is puzzling given the high abundance and ubiquitous distribution of this clade (Morris et al., 2002). It has been suggested that underrepresentation might be due to the presence of genes, in proximity to the ribosomal RNA operon, which express products toxic to the host cell (Béjà et al., 2000). To our knowledge, there is no evidence of such genes from published genomes.
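As an illustration of how a figure such as the 98.8% identity above can be derived from a pairwise alignment, the following is a minimal sketch. The text does not state how identity was calculated here, so the function name `percent_identity` and the choice to ignore gapped columns are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the source): columns where either sequence
    has a gap ('-') are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # ungapped alignment columns
    matches = 0   # columns where both bases agree
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences, not real 16S rRNA data.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other conventions (for example, counting gaps as mismatches) give slightly different values, so the denominator used should always be reported alongside the percentage.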

Table 1 summarizes the identity and location of each coding sequence (CDS) of fosmid clone 01–003783. Apart from the ribosomal RNA operon, a total of 25 complete and one partial CDS were predicted. Of these, only 16 showed high sequence similarity to genes from ‘Cand. P. ubique’ HTCC1062 (Table 1). To exclude the possibility that inclusion of these CDSs in the fosmid resulted from a chimeric cloning event, PCR primers were designed to amplify the fragments between adjacent CDSs, from CDS 13 to the rRNA operon (Figure 1). All of these regions were detected in the environmental DNA used to produce the fosmid library. As all CDSs were complete, a chimeric junction could only have formed between CDSs. By amplifying and sequencing these regions from environmental DNA, we have demonstrated that this arrangement of genes existed prior to library construction and hence that the clone is not a chimera.

Table 1 Details of coding sequences (CDSs) and ribosomal RNA sequences from the fosmid clone (EU410957)
Figure 1

Diagram of the fosmid clone sequence. Large arrows represent relative size and orientation of open reading frames (ORFs). 16S, 23S rRNA and the intergenic transcribed spacer region (ITS) are represented by two boxes titled 23S and 16S. Small solid arrows represent regions amplified by PCR from environmental DNA to validate clone structure. ‘Core conserved genes’ marks the area with synteny to ‘Candidatus Pelagibacter ubique’.

Only the four CDSs transcribed in the 3′-direction after the rRNA operon shared synteny with the ‘Cand. P. ubique’ HTCC1062 and HTCC1002 genomes (Table 1). The 11 CDSs with no homologue in the genome of HTCC1062 represent potential niche-specific acquisitions that may provide a competitive advantage (Wilhelm et al., 2007). For example, acquisition of a gene involved in 5′-nucleotidase and uridine diphosphate (UDP)-sugar hydrolysis (CDS 5) might aid the utilization of extracellular nucleotides as a phosphate source under conditions of phosphate starvation (Rittmann et al., 2005). The missing gene products required for these processes to be fully functional may be encoded within the 25 kb of the 48-kb HVR2 that were not part of the sequence of fosmid clone 01–003783. However, we were not successful in identifying a contiguous clone within the fosmid clone library. If these additional genes are in fact absent, the acquired genes are either non-functional or involved in other processes. The former is unlikely given the genomic streamlining of SAR11 organisms (Giovannoni et al., 2005), which would result in the loss of non-functional elements.

Notably, 22 of the CDSs of fosmid clone 01–003783 show higher sequence similarity and a closer phylogenetic relationship to the respective sequences of bacteria not belonging to the SAR11 clade (Table 1; Supplementary Figure S2). This contrasts with the phylogenetic inference based on the 16S rRNA gene sequence within the same fosmid clone (Supplementary Figure S1). A possible explanation is that CDSs 1–22 are a result of horizontal gene transfer. Although there is no direct evidence for phage-mediated horizontal gene transfer in fosmid clone 01–003783, the possibility is supported by the low G/C content (potentially indicative of phage genes) of six of the CDSs in the fosmid (Table 1). Also, phage integrase genes have been documented in the other 49% of HVR2 that was not in this fosmid clone (Rusch et al., 2007). The ecological ramifications of horizontal gene transfer for SAR11 bacteria have been discussed previously (for example, Rusch et al., 2007; Wilhelm et al., 2007).
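The G/C comparison mentioned above can be sketched as follows. This is an illustrative example only: the function names, the toy sequences and the 5-percentage-point `margin` are hypothetical, and the threshold actually used to call a CDS "low G/C" is not given in the text. The only value taken from the source is the clone-wide G/C content of 43.2%.

```python
def gc_content(seq: str) -> float:
    """G/C content (%) of a nucleotide sequence, ignoring ambiguous bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def flag_low_gc(cds_seqs: dict[str, str],
                reference_gc: float,
                margin: float = 5.0) -> list[str]:
    """Return names of CDSs whose G/C content lies more than `margin`
    percentage points below `reference_gc`.

    `margin` is a hypothetical cut-off chosen for illustration.
    """
    return [name for name, seq in cds_seqs.items()
            if gc_content(seq) < reference_gc - margin]


if __name__ == "__main__":
    # Toy sequences standing in for CDSs; not data from the fosmid clone.
    toy_cds = {"cds_a": "ATATATATGC", "cds_b": "GCGCATGCAT"}
    # 43.2 is the clone-wide G/C content reported in the text.
    print(flag_low_gc(toy_cds, reference_gc=43.2))
```

A fixed margin is the simplest possible criterion; low G/C content alone is, as the text notes, only potentially indicative of phage origin.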

In conclusion, the presence of a region of extreme variability within bacterial genomes (Rusch et al., 2007; Wilhelm et al., 2007) has been confirmed in fosmid clone 01–003783; this is a rare example of a fosmid that contains the SAR11 16S rRNA gene. The variable region HVR2 provides evidence in support of the hypothesis of bacterial promiscuity (Coleman et al., 2006; Fraser et al., 2007; Vergin et al., 2007). In particular, this study demonstrates the presence of genes within HVR2 that are not present in the genome of ‘Cand. P. ubique’ HTCC1062 or HTCC1002. Therefore, fosmid clone 01–003783 provides further evidence for the genomic plasticity of this bacterial lineage and its adaptability to specific environmental conditions, which may have been acquired through horizontal gene transfer.