Main

Cyanobacteria of the genus Prochlorococcus are the most abundant photosynthetic microorganisms inhabiting the oceans, key factors in the carbon cycle and a model organism in environmental microbiology (Partensky and Garczarek, 2010). They can be broadly classified into high-light and low-light (LL)-adapted ecotypes (Rocap et al., 2002). These ecotypes exhibit distinct distributions both vertically in the water column and geographically across oligotrophic tropical and subtropical waters (Bouman et al., 2006; Johnson et al., 2006; Zwirglmaier et al., 2008).

In past years, the genomes of over a dozen isolates of Prochlorococcus have been fully sequenced (for example, Kettler et al., 2007) and over a hundred single-cell-amplified partial genomes have been described (Malmstrom et al., 2013; Kashtan et al., 2014). All of them indicate that these strains cannot use nitrate as a nitrogen source. However, new uncultivated lineages of Prochlorococcus have been identified in the environment using culture-independent techniques based on the sequencing of the 16S rRNA gene and related genomic regions (Lavin et al., 2010; West et al., 2011; Mühling, 2012; Malmstrom et al., 2013). Moreover, nitrate assimilation rates were reported for uncultivated deep populations of Prochlorococcus in the Western Atlantic Ocean (Casey et al., 2007). In addition, genes necessary for nitrate assimilation associated with Prochlorococcus were identified in the global ocean sampling metagenomic database (Martiny et al., 2009) and in metagenomes of flow-cytometry-sorted Prochlorococcus populations (Batmalle et al., 2014).

Important uncultivated Prochlorococcus lineages include those thriving in anoxic marine zones (AMZs), where oxygen concentrations fall below the detection limit of modern sensors, light is scarce, but inorganic nutrients are plentiful (Goericke et al., 2000; Ulloa et al., 2012). Phylogenetic analysis using the 16S–23S rRNA internal transcribed spacer region revealed that the AMZ-associated Prochlorococcus assemblages are mainly composed of two novel LL ecotypes (termed LL-V and LL-VI), which correspond to basal groups linking Prochlorococcus with marine Synechococcus (Lavin et al., 2010), the other dominant marine picocyanobacterium. However, no genomic or physiological information exists for these AMZ lineages.

Here we report results from a metagenomic analysis carried out on environmental genomic sequences retrieved from a sample collected at 60 m depth within the AMZ of the eastern tropical South Pacific (Supplementary Figure S1), where dissolved oxygen was undetectable and inorganic nutrients were abundant (Supplementary Figure S2a; Thamdrup et al., 2012). The microbial community was enriched in Prochlorococcus, which comprised ~10% of cell abundance, versus ~0.7% for Synechococcus, as assessed by flow cytometry (Supplementary Figure S2b). BLAST analysis of the taxonomic affiliation of sequences matching the rpoC region 1, a taxonomic marker for cyanobacteria based on a single-copy gene (Palenik, 1994), showed an rpoC gene relative abundance of 86% for Prochlorococcus and 14% for Synechococcus (Supplementary Table S1), supporting the flow cytometry results. Moreover, of the 15% of protein-coding sequences assigned to cyanobacteria, 10% binned with Prochlorococcus and 5% with Synechococcus (Supplementary Figure S3). Of those assigned to Prochlorococcus, 90% were related to the LL ecotypes MIT9313 and MIT9303, the closest reported relatives to the AMZ lineages with genomes fully sequenced (Lavin et al., 2010). General statistics of this AMZ metagenome are shown in Supplementary Tables S2 and S3.

Analysis of de novo-assembled contigs revealed the presence of several large contigs that binned with Prochlorococcus. In particular, a single contig was found to encode genes related to urea and nitrate uptake and assimilation (contig 51148, GenBank accession number KM282015; 10 300 bp; Figure 1), in synteny with those in Synechococcus WH8102. The genes in the urease gene cluster (ureABCD) presented high identity to those described for Prochlorococcus MIT9313 and MIT9303 (Rocap et al., 2003; Supplementary Figure S4). Notably, the nitrate/nitrite transporter napA and assimilatory nitrate reductase narB were also found within the same contig (Figure 1a), as well as the genes moeA and mobA (Supplementary Figure S5), which are involved in the biosynthesis of the Mo-cofactor and necessary for narB function (Flores et al., 2005). None of these genes have been found in any of the genomes of Prochlorococcus sequenced and described so far. However, homologues that presumably come from uncultivated relatives of Prochlorococcus have been found in the global ocean sampling database (Martiny et al., 2009) and in metagenomes of uncultured, sorted Prochlorococcus populations (Batmalle et al., 2014).

Figure 1

Genomic characteristics of the nitrogen assimilation operon found in contig 51148. (a) Schematic representation of syntenies among contig 51148, Prochlorococcus MIT9313 and MIT9303 genomes, and Synechococcus WH7803 and WH8102 genomes centered on nitrate and urea assimilation genes. Identities (%) among sequences are shown in gray. (b) GC content. (c) Contig coverage. (d) Proximity matrix (Euclidean distance) of the difference in codon usage pattern for the genomes of Prochlorococcus (Pro) and Synechococcus (Syn), and of contig 51148. The shortest distance (dark blue) indicates the highest proximity. (e) Spearman rank-order correlation between tetranucleotide frequency of contig 51148 and those of genomes of marine Prochlorococcus (Pro) and marine Synechococcus (Syn). The highest correlation is shown in dark green.

The GC content of contig 51148 was ~51.1% (Figure 1b), similar to that of LL Prochlorococcus and some marine Synechococcus (Kettler et al., 2007). Likewise, the narB gene had a GC content of 52%, which is less than the ~60% of those in the marine Synechococcus strains WH8102 and WH7803 (to which it presented the highest nucleotide identity), but substantially higher than the ~40% GC of the global ocean sampling high-light Prochlorococcus narB (Supplementary Figure S6). Analysis of codon usage patterns (Yu et al., 2012) and tetranucleotide frequencies (see Supplementary Material and Methods) showed that the cyanobacterial portion of the metagenome and contig 51148 exhibit the highest similarity with LL Prochlorococcus MIT9303 (Figures 1d and e). Additionally, nucleotide identities and phylogenetic analysis confirmed that the urease genes of contig 51148 were associated more closely with Prochlorococcus than with Synechococcus (Supplementary Table S4 and Supplementary Figure S4).

The homogeneous GC content of contig 51148, the differences in codon usage bias with Synechococcus and phylogenetic analyses of AMZ narB and napA (Figures 2a and b) all suggest that the genetic potential for nitrate uptake and assimilation was not obtained recently by horizontal gene transfer, but instead was potentially retained from a common ancestor with Synechococcus. Mapping the presence/absence of the different nitrate utilization genes onto the cyanobacteria 16S rRNA phylogenetic tree is consistent with this hypothesis (Supplementary Figure S7).
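The codon usage comparison that underlies this argument can be sketched as a Euclidean distance between codon frequency vectors, in the spirit of the proximity matrix in Figure 1d. The Python below is a simplified illustration only: the exact procedure of Yu et al. (2012) is not reproduced here, the function names and toy sequences are hypothetical, and the sketch assumes that each input sequence is an in-frame coding sequence.

```python
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so that frequency vectors are comparable.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def codon_frequencies(coding_seqs):
    """Relative frequency of each codon across in-frame coding sequences."""
    counts = dict.fromkeys(CODONS, 0)
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in counts:  # ignore codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy example with made-up coding sequences (not real genes).
contig_genes = ["ATGGCTGCAGGTTAA", "ATGCTGGGCGCTTGA"]
genome_a_genes = ["ATGGCTGCTGGTTAA", "ATGCTTGGCGCATGA"]
genome_b_genes = ["ATGGCCGCCGGCTAA", "ATGCTCGGGGCGTGA"]

contig_profile = codon_frequencies(contig_genes)
for name, genes in [("genome A", genome_a_genes), ("genome B", genome_b_genes)]:
    print(name, euclidean_distance(contig_profile, codon_frequencies(genes)))
```

A smaller distance indicates a more similar codon usage pattern; a recently transferred gene cluster would be expected to sit closer to its donor lineage than to the rest of the host genome.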

Figure 2

Phylogenetic trees for nitrate assimilation and uptake genes. Maximum-likelihood phylogenetic trees of (a) narB- and (b) napA-predicted amino acid sequences found in contig 51148. Evolutionary history was inferred using neighbour joining (NJ), maximum parsimony (MP) and maximum likelihood (ML). Bootstrap support values for 100 replications are shown at the nodes (NJ/MP/ML).

In summary, our results indicate that AMZ Prochlorococcus lineages have the genetic potential for urea and nitrate assimilation, likely an adaptation to the unique nutrient-rich environment where they thrive. Additional genomic characteristics that could explain their high abundance in the oxygen-deficient and very low-light waters of AMZs remain to be assessed.