The production of bioavailable nitrogen (N) supplied via biological dinitrogen (N2) fixation, i.e., the reduction of N2 gas to ammonium, supports primary productivity in wide areas of the oligotrophic oceans [1]. Besides N2-fixing (or diazotrophic) cyanobacteria, recent studies have suggested that non-cyanobacterial diazotrophs (NCDs) might also play key roles in marine N2 fixation [2]. Within the diversity of the nitrogenase-encoding nifH gene, a marker gene frequently used to assess the diversity of diazotrophs, a group of Gammaproteobacteria known as Gamma-A (also γ-24774A11 or UMB) [3, 4] has been suggested to be one of the most important NCDs [5, 6]. However, current knowledge of Gamma-A is based solely on amplification of a small fragment of its nifH gene, which can be detected with degenerate PCR primers [7] or quantified using specific primers and probes in quantitative PCR surveys [4, 8, 9]. Using these approaches, Gamma-A has been found in the warm, oligotrophic, and fully oxygenated surface waters of tropical and subtropical latitudes [3,4,5,6, 8, 10,11,12], leading to the hypothesis that this diazotroph might rely either on a light-driven machinery or on photosynthetic products from other organisms [13]. However, the lack of genomic information has prevented further insight into the metabolism and ecology of Gamma-A.

One of the many unresolved questions about Gamma-A concerns its cell-size distribution. The size-class where N2 fixation occurs is important because it determines, for example, the efficiency of sinking and sequestration of fixed N into deep waters [14]. Despite the widespread distribution of Gamma-A, none of the recent global metagenomic studies that explore marine diazotrophic diversity within the free-living size-fraction (<3 μm) have detected Gamma-A-related nifH sequences [15, 16]. Indeed, among previous PCR-based studies that quantified the abundance/expression of the Gamma-A nifH gene (Table S1), the only two studies that explicitly distinguished between the <3 and >3 μm size-fractions showed that Gamma-A was exclusively present in the >3 μm size-fractions [17, 18]. Furthermore, Gamma-A-related nifH sequences have recently been associated with sinking particles ranging in size from 50 to 200 μm [14] and the gut content of copepods [19]. Altogether this suggests that Gamma-A may either be large in size or attached to particles or other organisms, but a thorough global analysis of the distribution of Gamma-A across size fractions has never been conducted.

Here, we aimed to gain further knowledge of Gamma-A by searching for Gamma-A nifH-related sequences within the Marine Atlas of Tara Ocean Unigenes (MATOU) database [20] and by exploring their presence and activity (via expression of nif transcripts) in metagenomes and metatranscriptomes across four planktonic size-fractions (0.8–5, 5–20, 20–180, and 180–2000 µm) from the sunlit ocean sampled during the global Tara Oceans expedition [20]. Methods used for the identification and quantification of Gamma-A genes in the Tara Oceans dataset are described in the Supplementary information.

Results and discussion

We recruited only one Gamma-A nifH-containing contig (contig ID “MATOU-v1_23614344”) from MATOU; it shared 99.7% nucleotide identity with the Gamma-A nifH gene (AY896371.1). The MATOU-v1_23614344 contig (hereafter Gamma-A-MATOU) was 4737 nucleotides long, and on it we predicted the complete nifH, nifK, nifD and nifT gene sequences (Table 1). The low percent identity between Gamma-A-MATOU and currently sequenced genomes highlights the lack of representative genomes for this gammaproteobacterial diazotrophic cluster.

Table 1: Gene annotation of the Gamma-A-MATOU contig.

We analyzed the abundance and expression of Gamma-A-MATOU across size-fractions in surface (Fig. 1) and deep chlorophyll maximum (DCM) (Fig. S1) waters of the global ocean. Gamma-A-MATOU was detected in 49 (out of 355) metagenomes and 65 (out of 354) metatranscriptomes, corresponding to 24 stations located in the Mediterranean Sea, the Indian Ocean, the North and South Atlantic Oceans, and the North and South Pacific Oceans (Figs. 1 and S1). Within the 5–20 µm size-fraction, the number of metatranscriptomes in which Gamma-A-MATOU was detected (n = 27) exceeded the number of metagenomes (n = 15). Whereas the relative abundance of Gamma-A-MATOU genes and transcripts was remarkably constant across the four size-fractions at the DCM, it was much more variable between size-fractions in the surface layer, being higher in the 0.8–5 and 5–20 µm size-fractions than in the two largest size-fractions (Fig. 2).
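To put the detection counts above in proportion, the share of samples in which Gamma-A-MATOU was detected can be computed directly from the reported numbers. The following is a minimal sketch; the function name `detection_percentage` is hypothetical and introduced only for illustration, and the counts are those given in the text.

```python
def detection_percentage(n_detected: int, n_total: int) -> float:
    """Return the percentage of samples in which the contig was detected."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_detected <= n_total:
        raise ValueError("n_detected must lie between 0 and n_total")
    return 100.0 * n_detected / n_total


# Counts reported in the text: 49 of 355 metagenomes, 65 of 354 metatranscriptomes.
metagenome_pct = detection_percentage(49, 355)
metatranscriptome_pct = detection_percentage(65, 354)

print(f"Metagenomes with Gamma-A-MATOU: {metagenome_pct:.1f}%")                # 13.8%
print(f"Metatranscriptomes with Gamma-A-MATOU: {metatranscriptome_pct:.1f}%")  # 18.4%
```

Under these counts, the contig was detected in roughly one in seven metagenomes and slightly less than one in five metatranscriptomes.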

Fig. 1: Distribution of Gamma-A-MATOU in the surface ocean.

Abundance (metagenome-based; left panel) and expression (metatranscriptome-based; right panel) of Gamma-A-MATOU across size fractions are shown. The area of the bubble is proportional to the abundance of metagenomic reads (blue) or transcripts (red) of Gamma-A-MATOU for each sample. Abundances of metagenomic and metatranscriptomic reads are expressed as RPKM (Reads Per Kilobase covered per Million of mapped reads).

Fig. 2: Abundance and expression of the Gamma-A-MATOU nitrogenase gene cluster across size fractions and depths.

Boxplots representing the abundance (blue) and expression (red) of Gamma-A-MATOU. The number of samples used for the calculation of each boxplot is indicated between parentheses. Abundances of metagenomic and metatranscriptomic reads are expressed as RPKM (Reads Per Kilobase covered per Million of mapped reads).
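The RPKM unit used in both figure legends can be illustrated with a short calculation. This is a minimal sketch that assumes the conventional RPKM formula (reads mapped to the gene, divided by the covered length in kilobases and by the millions of mapped reads in the sample); the exact computation used for the Tara Oceans data is described in the Supplementary information and may differ in detail. The function name `rpkm` and the read counts in the example are hypothetical; only the contig length of 4737 nucleotides comes from the text.

```python
def rpkm(reads_on_gene: int, covered_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase covered per Million of mapped reads (conventional formula)."""
    if covered_length_bp <= 0:
        raise ValueError("covered_length_bp must be positive")
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    kilobases = covered_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return reads_on_gene / (kilobases * millions_of_reads)


# Hypothetical sample: 50 reads recruited to a fully covered 4737-nt contig
# in a library of 20 million mapped reads (read counts invented for illustration).
example_value = rpkm(reads_on_gene=50, covered_length_bp=4737, total_mapped_reads=20_000_000)
print(f"RPKM = {example_value:.3f}")
```

Normalizing by both covered length and sequencing depth is what makes values comparable between samples of different sizes, as in the bubble plots and boxplots.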

This global distribution of active Gamma-A in tropical and subtropical photic waters is consistent with previous primer-based studies [5, 6] but further shows its ubiquitous occurrence across planktonic size-fractions spanning at least three orders of magnitude in size (Figs. 1 and S1). The presence of diazotrophs in large planktonic size-fractions, which might include sinking particles and the guts of copepods [14, 19], has been linked to more efficient sinking and sequestration of fixed N in deep waters [21]. Hence, our results suggest that Gamma-A may be one of the most important NCDs in marine nitrogen fixation. Its presence in the 0.8–5 µm size-fraction, together with its previously reported absence in size-fractions <3 µm [15,16,17,18], suggests that most of the signal from the 0.8–5 µm size-fraction came from particles larger than 3 µm. Furthermore, the relative abundance of Gamma-A-MATOU genes and transcripts was generally higher in the 0.8–5 and 5–20 µm size-fractions, indicating that Gamma-A might be fixing N2 preferentially in these fractions, although transcripts were also detected, at lower abundances, in the largest size-fraction (180–2000 µm) (Fig. 2). Our analysis further suggests that Gamma-A might be able to attach to different marine phytoplankton species spanning sizes from 3 to 2000 µm, or to associate with organic particles of various sizes. Sinking particles have been suggested to be potential niches for heterotrophic N2 fixation [2, 14, 22] because microbial remineralization of organic matter generates microaerobic environments within them [23, 24], which may provide suitable conditions for oxygen-sensitive nitrogenases. Gamma-A might also form aggregates of cells like those observed in other N2-fixing gammaproteobacterial species such as Pseudomonas stutzeri, which might be a mechanism to control O2 diffusion and facilitate N2 fixation in oxic environments [25]. Alternatively, Gamma-A might be a filamentous N2-fixing microorganism like the diazotrophic cyanobacterium Trichodesmium [26]: Trichodesmium nifH gene sequences were also recruited across different size-fractions from this same dataset, pointing to the presence of filaments of different sizes (data not shown).

The new information on Gamma-A nitrogenase-related genes provided here might help in the design of new molecular probes that, combined with visualization techniques such as geneFISH [27] and cell-sorting techniques [28], will yield insight into the genome and the lifestyle of this cosmopolitan diazotroph. Future efforts focused on reconstructing prokaryotic genomes from larger planktonic size-fractions might help to reveal the metabolic potential of currently elusive, uncultured diazotrophic microorganisms such as Gamma-A.