Introduction

Nitrogen (N)-fixing diazotrophs like cyanobacteria in the genus Trichodesmium are common in surface waters of the vast tropical and subtropical oligotrophic regions of the ocean such as the North Pacific and North Atlantic subtropical gyres (NPSG and NASG) [1, 2]. Atmospheric dinitrogen (N2) fixation by diazotrophs such as Trichodesmium plays an important role in these low nutrient subtropical gyres by fueling primary production and carbon (C) export in the ocean [1,3,4,5,6,7]. N2 fixation by the genus Trichodesmium alone accounts for approximately 50% of the input of new N in these oligotrophic regions, contributing significantly to the cycling of N and C [8, 9]. In addition, new estimates reveal that Trichodesmium contributes to the turnover of reduced phosphorus (P) compounds such as phosphite and phosphonate [10], driving the cycling of P and the production of the greenhouse gas methane in low P oligotrophic oceans [11, 12].

Characterization of the geochemical constraints of oceanic N2 fixation is the subject of intense study. P and iron (Fe) are recognized as major drivers of the distributions and activities of diazotrophs in the oligotrophic ocean gyres over both modern and geological time scales [5, 13]. Fe bioavailability is particularly important due to the high Fe quota of enzymes involved in N2 fixation (e.g., nitrogenase) and photosynthetic electron transport [14]. Models used to predict patterns in global N2 fixation suggest that P constrains diazotroph growth in the NASG, whereas Fe constrains diazotroph growth in the NPSG [13]. However, parameterizing phytoplankton activities and growth in biogeochemical models is challenging, thus leading to uncertainty in their outputs. Model parameters are typically derived from culture measurements of P and Fe uptake and these studies are predominately done with model strains that are not representative of field populations. For example, Trichodesmium erythraeum IMS101 is a commonly used isolate in laboratory-based studies [2], yet it belongs to a different phylogenetic clade than that which dominates the NASG and NPSG [15, 16]. An additional challenge is that the geochemical conditions in culture are not fully representative of the field, where Fe and P may both be low and where bioavailability varies over the range of chemical species and organic complexes that exist, which are difficult to characterize in marine environments [17, 18]. Recent modeling efforts have focused on some of these challenges by using resource competition theory to evaluate the role of organically complexed P in fueling NASG N2 fixation [19] and using resource ratio theory to highlight the apparent importance of both Fe and P in driving Trichodesmium biogeography and N2 fixation across oceanic provinces [20, 21]. These studies support the regional importance of Fe and P biogeochemistry in Trichodesmium physiological ecology.

Trichodesmium has acquired a number of adaptations to meet P and Fe demands in the oligotrophic regions where it occurs, such as the ability to regulate cellular P [22] and Fe quotas [23], take up different Fe [24] and P species [25], and access dissolved organic P (DOP) with enzymes like alkaline phosphatases (APs) and C-P lyases [23,26,27,28,29]. Notably, some of these adaptations could impose tradeoffs by increasing requirements for metal co-factors. For instance, access to the DOP pool might be limited by Fe bioavailability, because PhoX, the most widely distributed type of AP among marine bacteria [30], requires Fe as a metal cofactor [31, 32]. In fact, whole-water field incubations with added Fe in low Fe regions showed an enhancement of AP activity [33]. Such potential interactions and adaptations further exacerbate the complexities surrounding the identification and modeling of the geochemical drivers of N2 fixation.

In the field, Trichodesmium colonies are complex communities with a rich consortium of heterotrophic epibionts and other microorganisms [34,35,36]. As a consequence, many useful approaches for evaluating Fe and P bioavailability, such as enzymatic activity assays, are not necessarily specific to Trichodesmium [27, 35, 37]. Molecular approaches show promise in this regard since they can be tuned to examine Trichodesmium-specific signals, especially as the pathways for resource acquisition are increasingly well understood [25, 29,38,39,40]. Trichodesmium-specific transcriptional profiles may yield insights into the tradeoffs in resource acquisition and limitation patterns, as protein profiles have done for populations of the cyanobacterium Prochlorococcus [41]. Uncertainties remain in characterizing how resource bioavailability drives Trichodesmium physiological ecology in the field, and this, in turn, limits modeling efforts. Here, metatranscriptome profiling was used to identify Trichodesmium-specific signals from colonies in the NASG and NPSG in order to ascertain ecosystem-specific geochemical drivers of this globally significant diazotroph.

Material and methods

Field sample collection

Sampling took place during a cruise transect in the NASG aboard the R/V Oceanus (OC471, April and May 2011) and as part of a time-series sampling in the NPSG at Station ALOHA, situated at 22.75°N, 158°W (HOE-DYLAN#7 and #9, August 2012) (Table S1, Fig. S1). A total of 12 samples were sequenced, 9 in the NASG and 3 in the NPSG. All Trichodesmium colonies were collected from near the surface (approximately within the upper 25 m) using a handheld 130 μm net, within 2 h of local solar noon. As nif genes in Trichodesmium display strong diel signals [42], samples were always collected at the same point relative to local noon to remove potential diel variability in overall gene expression. Single colonies were picked and rinsed three times in 0.2 μm filtered local surface water collected at 5 m with a Rosette sampling device. Between 10 and 15 washed Trichodesmium colonies per station were filtered onto 47 mm, 10 μm polycarbonate filters, which were then placed in 2 ml cryovials, snap-frozen and stored in liquid N until RNA extraction was performed in the laboratory. Total time from sample collection to preservation was roughly 15 min. In situ profiles of temperature and salinity, and photosynthetically active radiation (PAR) in the NASG were measured by a conductivity, temperature, and depth instrument deployed at each station. In the NPSG, PAR at the sea-surface was measured using a cosine sensor (LI-COR model LI-192) mounted on the top deck of the ship. Vertical profiles of downwelling PAR irradiance were obtained using a Free-Falling Optical Profiler (Satlantic). Both measurements were then used to compute the percent level of PAR with respect to the surface value on 1 m depth bins in the water column.
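The percent-PAR calculation described above can be illustrated with a minimal Python sketch. It assumes a single surface reference value per profile and simple averaging of readings within each 1 m bin; the function name, argument names, and binning rule are hypothetical and are not taken from the processing software used in this study.

```python
from collections import defaultdict
from statistics import mean


def percent_par_by_depth_bin(depths_m, par_profile, surface_par):
    """Return {bin_start_m: percent of surface PAR} for 1 m depth bins.

    Assumption: readings that fall in the same 1 m bin are averaged, and
    surface_par is one reference value for the whole profile.
    """
    if surface_par <= 0:
        raise ValueError("surface PAR must be positive")
    bins = defaultdict(list)
    for depth, par in zip(depths_m, par_profile):
        # int() truncates, which equals flooring for non-negative depths.
        bins[int(depth)].append(par)
    return {
        bin_start: 100.0 * mean(values) / surface_par
        for bin_start, values in sorted(bins.items())
    }


if __name__ == "__main__":
    profile = percent_par_by_depth_bin(
        depths_m=[0.2, 0.8, 1.5, 2.4],
        par_profile=[1000, 900, 500, 250],
        surface_par=1000,
    )
    for bin_start, pct in profile.items():
        print(f"{bin_start}-{bin_start + 1} m: {pct:.1f}% of surface PAR")
```

Expressing irradiance as a percentage of the surface value makes profiles from different days comparable even when incident light differs.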
For the NASG samples, nitrite + nitrate (\(NO_2^ -\) + \(NO_3^ -\)) and soluble reactive silica (Si) were determined from 125 ml of water filtered through a 0.2 μm polycarbonate filter and stored frozen (–20°C) in 10% HCl-cleaned bottles until analysis following the facility’s protocols at the Chesapeake Bay Lab at the University of Maryland (http://nasl.cbl.umces.edu/methods/WCC.html). Low-level dissolved inorganic phosphate (\(PO_4^{3 - }\)) was assayed using a modified MAGIC method [43] with a detection limit of 2.5 nM. All of the NPSG samples were assayed using the protocols of the Hawaii Ocean Time-series (HOT) program (http://hahana.soest.hawaii.edu/hot/protocols/protocols.html). Dissolved Fe (dFe) concentrations associated with the NPSG samples were collected, analyzed, and reported in Fitzsimmons et al. [17]. Surface dFe concentrations in the NASG were not directly measured in this study, and were computed as the dFe climatological average using data from stations sampled across the same geographical transect as this study (10°N-30°N and 70°W-25°W) throughout different seasons and years. The dFe climatological average was consistent with discrete surface values measured on a similar spring NASG transect reported by Chappell et al. [44]. Notably, all discrete \(PO_4^{3 - }\) and dFe values were consistent with the regional NASG and NPSG climatology [28, 44,45,46,47,48,49,50,51,52,53,54,55,56].

RNA extraction, mRNA purification, and sequencing

RNA was extracted from each of the filters using the RNeasy Mini Kit (Qiagen), following a modified version of the purification protocol for yeast. Briefly, lysis buffer with 0.01% of β-Mercaptoethanol and RNA-clean zirconia/Si beads (0.5 mm) were added to the filter and samples were vortexed for 5 min, placed on ice for 1 min, and vortexed again for 5 min. Samples were then processed following the remainder of the yeast protocol, as outlined by the manufacturer, and RNA was eluted in water. To eliminate potential DNA contamination, RNA was treated using the RNase-Free DNase Set (Qiagen) and then further purified and concentrated using the RNA Cleanup Protocol from the RNeasy Mini Kit (Qiagen). The RNA was eluted in Tris-EDTA (TE) buffer and potential eukaryotic RNA was removed using the MICROBEnrich kit (Thermo Fisher Scientific). Finally, for enrichment of bacterial mRNA and removal of ribosomal RNA, the enriched bacterial RNA was processed through the Ribo-Zero Magnetic Kit for Bacteria (Illumina, cat. no MRZMB126). Successful removal of ribosomal RNA from the samples was confirmed using a Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). RNA samples were prepared for sequencing using the Illumina TruSeq RNA prep kit and the resulting library was sequenced by the JP Sulzberger Columbia Genome Center (CUGC) on an Illumina HiSeq 2000 resulting in single-end reads of 100 bp, with an overall sequencing depth of ~30 million reads per sample (Table S2). Sequence data are deposited in the Sequence Read Archive, BIOPROJECT PRJNA374879.

Read mapping

Raw sequence data quality was visualized using FastQC and then cleaned and trimmed using Trimmomatic version 0.27 (single-end mode, 6-bp wide sliding window for quality below 20, minimum length of 25 bp). To evaluate the best read mapping approach, trimmed single-end reads from each of the samples were mapped to a combined Trichodesmium spp. genome, obtained by merging the T. erythraeum IMS101 genome (NCBI reference sequence: NC_008312) with the partial Trichodesmium thiebautii H94 genome (GenBank: LAMW00000000), resulting in an average of 6.5 ± 7.3% reads mapped. Reads were mapped using RNA-Seq by Expectation Maximization (RSEM) [57] with Bowtie2 [58] using default parameters. These low mapping rates were similar to other Trichodesmium field studies [59, 60], and likely reflect the presence of reads of heterotrophic bacterial epibionts, which are present at high concentrations in the colonies [35, 36], as well as the potential variability between the genomes of cultured Trichodesmium and field populations [59], as field populations encompass a diversity of Trichodesmium species [15, 16]. To target a Trichodesmium community more representative of the field, reads were mapped to Trichodesmium-identified genes from a custom Trichodesmium metagenome database assembled by Frischkorn et al. [34]. These field-specific Trichodesmium genes were previously generated and analyzed by Frischkorn et al. [34] from four Trichodesmium-identified genome bins in Trichodesmium colony metagenomes. Briefly, their approach utilized MaxBin 2.0 [61], Prodigal [62], and DIAMOND [63] against the NCBI nr database to generate genome bins, predict coding sequences and proteins, and identify taxonomy, respectively [34]. This custom Trichodesmium metagenome mapping approach yielded a 1.6 times higher mapping rate, on average, relative to the culture genome database (Table S2). 
With this custom Trichodesmium metagenome database, mapping rates in the NPSG were not consistently lower than those from the NASG (Table S2). Moreover, the average percentage of reads mapping to sequences of each of the four Trichodesmium genomic bins defined in Frischkorn et al. [34] was similar in the NASG and NPSG (Fig. S2). Taken together, this suggests the custom Trichodesmium metagenome database [34] works well in both regions and, thus, was used for subsequent analyses.
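The trimming settings listed at the start of this subsection (6-bp sliding window, quality threshold of 20, minimum length of 25 bp) can be illustrated with a simplified Python sketch. This is not Trimmomatic's implementation, and its handling of read ends may differ from the tool. The sketch cuts a read at the start of the first window whose mean quality falls below the threshold and discards reads that end up too short.

```python
def sliding_window_trim(seq, quals, window=6, min_mean_q=20, min_len=25):
    """Simplified sliding-window quality trimming (illustrative only).

    seq   : read sequence (str)
    quals : per-base Phred scores (list of int), same length as seq
    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    keep = len(seq)
    # Scan from the 5' end; stop at the first window with low mean quality.
    for start in range(len(seq) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_mean_q:
            keep = start  # cut at the start of the failing window
            break
    if keep < min_len:
        return None  # read too short after trimming
    return seq[:keep], quals[:keep]


if __name__ == "__main__":
    read = "A" * 40
    quality = [30] * 30 + [2] * 10  # quality collapses near the 3' end
    print(sliding_window_trim(read, quality))
```

A window mean, rather than a per-base cutoff, tolerates isolated low-quality bases while still removing a 3' tail where quality degrades.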

Trichodesmium-only proteins predicted from the custom Trichodesmium metagenome database were clustered into 6710 orthologous groups (OGs), as described by Frischkorn et al. [34]. This yielded ~2.3 times more OGs than that derived from the T. erythraeum IMS101 genome (2982) as per Frischkorn et al. [34]. This disparity could stem from the presence of multiple species with different gene contents in Trichodesmium communities in the field and the fact that T. erythraeum IMS101 is not the dominant species in field populations [15, 16]. OGs were assigned putative annotations using the UniRef90 database [63] and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Single functional annotations for entire OGs were determined by selecting the most abundant UniRef annotation for all proteins clustered into that group. KEGG annotations were used in some cases to refine the most abundant annotation. To most stringently categorize OGs identified as APs, annotations were refined with DIAMOND searches against representative proteins from COGs 3211 (PhoX), 1785 (PhoA), as well as the protein sequences of three previously identified putative APs in the IMS101 genome (PhoA: YP723031, PhoX: YP723360, and PhoX2: YP723924) [29], and as described in Frischkorn et al. [34]. For comparisons against the putative PhoA gene from T. erythraeum IMS101, OGs within the metagenomes were considered homologous to this protein if the blast e-value was <\(1 \times 10^{-5}\) with a bit score >50 and contained UniRef blast homologs to the YP723031 gene from T. erythraeum IMS101 and other putative AP genes identified through KEGG or UniRef annotation [34]. For the AP OGs, we detected a single PhoX OG and two PhoA OGs termed PhoA1 and PhoA2 herein (Table S4).
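Two of the annotation rules described above lend themselves to a short illustration: assigning each OG its most abundant protein annotation, and the numerical part of the homology filter (e-value below 1 × 10^-5 and bit score above 50). The Python sketch below uses hypothetical function names and input structures. The tie-breaking behavior (first annotation encountered) is an assumption, and the additional requirement for UniRef homologs is not modeled.

```python
from collections import Counter


def consensus_annotation(protein_annotations):
    """Return the most frequent annotation among the proteins of one OG.

    Missing annotations (None or empty strings) are ignored. Ties resolve
    to the annotation encountered first, which is an assumption made here.
    """
    counts = Counter(a for a in protein_annotations if a)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def passes_homology_filter(evalue, bit_score,
                           max_evalue=1e-5, min_bit_score=50):
    """Numerical part of the homology criterion: e-value and bit score."""
    return evalue < max_evalue and bit_score > min_bit_score


if __name__ == "__main__":
    og_proteins = ["alkaline phosphatase PhoX",
                   "alkaline phosphatase PhoX",
                   "hypothetical protein"]
    print(consensus_annotation(og_proteins))
    print(passes_homology_filter(evalue=1e-30, bit_score=210.0))
```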
Additional nucleotide alignments were performed for all Trichodesmium-identified sequences in OGs classified as pstS/sphX, phoA, and phoX [34] along with representative sequences [29] of these genes from cultured isolates of Clade I, the dominant population in both regions (T. thiebautii, T. spiralis, and T. tenue). The MUSCLE application in Geneious (v.11) was used with default settings to run the alignment [64]. The culture pstS/sphX, phoA, and phoX sequences for T. thiebautii, T. spiralis, and T. tenue were obtained from GenBank under accession numbers FJ602760-602771. To conservatively approach the expression data, all reads mapping to gene variants within a given OG were pooled for subsequent analyses. OG counts were normalized to total mapped reads across each sample and only those OGs that had 1 read per million in at least three samples were included and considered detectable.
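The normalization and detection filter described above can be sketched in Python as follows. The sketch assumes that "1 read per million" means at least 1 read per million mapped reads, and that the total mapped read count is supplied per sample. The function names and data layout are hypothetical.

```python
from collections import Counter


def reads_per_million(og_counts, total_mapped_reads):
    """Normalize pooled OG read counts to reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total mapped reads must be positive")
    return {og: 1e6 * count / total_mapped_reads
            for og, count in og_counts.items()}


def detectable_ogs(rpm_by_sample, min_rpm=1.0, min_samples=3):
    """Return OGs with at least min_rpm in at least min_samples samples.

    rpm_by_sample: {sample_name: {og_name: reads_per_million}}
    """
    hits = Counter()
    for rpm in rpm_by_sample.values():
        for og, value in rpm.items():
            if value >= min_rpm:
                hits[og] += 1
    return {og for og, n in hits.items() if n >= min_samples}


if __name__ == "__main__":
    raw = {
        "sample_1": {"OG_a": 5, "OG_b": 1},
        "sample_2": {"OG_a": 3, "OG_b": 4},
        "sample_3": {"OG_a": 2, "OG_b": 0},
    }
    rpm = {name: reads_per_million(counts, 2_000_000)
           for name, counts in raw.items()}
    print(sorted(detectable_ogs(rpm)))
```

Pooling reads across gene variants within an OG before normalization, as done in this study, means the filter operates on OG totals rather than on individual sequences.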

Differential expression analyses

Correspondence analyses (CAs) were conducted with the vegan package in R [65] using OG counts obtained from RSEM to examine differential OG expression patterns across samples. Vector fitting was done using the ‘envfit’ function in vegan with 9999 permutations to estimate the significance of the correlations between biogeochemical variables (Table S3) and the ordination by CA (p < 0.05), using the Benjamini–Hochberg (BH) algorithm [66] to control the false discovery rate (FDR). Statistical differences in global transcriptional patterns between NPSG and NASG communities were tested with a permutational multivariate analysis of variance (PERMANOVA) in R, using a Bray–Curtis dissimilarity matrix from the normalized OG table as the input for the ‘adonis’ function within the vegan package, with 9999 permutations [67, 68]. This analysis was repeated on the subset of OGs that were co-expressed in both the NPSG and the NASG. OG counts obtained from RSEM were also used to calculate differential expression with the package edgeR in R [69] for individual OGs in the NPSG compared with the NASG Trichodesmium sp. populations, treating each sample in the two regions as biological replicates. Default parameters were used to calculate dispersion of normalized counts for the replicates within each ocean basin so that these could be combined for further analyses [70]. Pairwise comparisons of combined counts (i.e., OG relative abundance) between ocean basins (NPSG versus NASG) were made with the exactTest function. FDR was controlled with the BH algorithm. FDR values <0.05 reflect statistically significant differences in OG relative abundance between NPSG and NASG communities (Table S4).
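For readers unfamiliar with the Benjamini–Hochberg adjustment applied above, a minimal Python sketch of the standard step-up procedure is given below. It is a plain re-implementation for illustration, not the code used by vegan or edgeR, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), then
    monotonicity is enforced from the largest rank downward.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw_p = [0.01, 0.04, 0.03, 0.005]
    fdr = benjamini_hochberg(raw_p)
    significant = [p for p in fdr if p < 0.05]
    print(fdr, len(significant))
```

Controlling the FDR rather than the family-wise error rate is the usual choice when thousands of OGs are tested at once, because it keeps the expected share of false positives among the reported OGs below the chosen level.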

Results and discussion

Transcriptional patterns vary significantly between NASG and NPSG populations

Transcriptional patterns in Trichodesmium populations collected in the NASG and NPSG (Fig. S1) were examined to ascertain the geochemical drivers of Trichodesmium physiological ecology in these two ocean gyres. Of the 6710 Trichodesmium OGs in the custom metagenome database [34], a total of 4709 OGs were expressed by Trichodesmium communities in this study, of which 89% were common between the populations of the two oceans (Fig. S3). CA identified a significant difference (PERMANOVA: F = 4.51, p = 0.013) in global transcriptional pattern between NASG and NPSG Trichodesmium communities, with samples from each ocean basin segregating on the CA ordination plane (Fig. 1). Vector fitting showed that phosphate (\(PO_4^{3 - }\)), and the climatological average for dFe, nitrate + nitrite (\(NO_2^ -\) + \(NO_3^ -\)), and temperature (Temp) significantly correlated (p < 0.05) with the global transcriptional pattern of these communities, whereas salinity and PAR did not (Table S3). Si, which is not used by Trichodesmium as a resource, did not significantly correlate with the differences in transcription. Samples from the NPSG were placed in the region of the ordination plane where relatively higher \(PO_4^{3 - }\) concentrations were measured and samples from the NASG were placed in the region of the ordination plane where relatively higher dFe concentrations were recorded (Fig. 1). Similar patterns in the CA ordination, vector fitting, and PERMANOVA analysis (F = 4.31, p = 0.021) were observed when these analyses were run with the subset of OGs co-expressed in both the NASG and the NPSG (Fig. S4). Transcripts in Trichodesmium can turn over quickly in response to changes in geochemistry [29, 38] and are averaged here over both time and space, which would tend to minimize potential differences across ocean basins. 
Yet, the global transcriptional patterns associated with each basin were strikingly different and linked with the basin-scale climatology and geochemistry, particularly for Fe and P.
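The pseudo-F statistic and permutation p-value reported above come from a PERMANOVA on Bray–Curtis dissimilarities. The Python sketch below illustrates the general calculation for a one-way design. It is not the vegan 'adonis' implementation, the function names are hypothetical, and it assumes at least two groups with non-zero within-group variation.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square dissimilarity matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2
                   for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=9999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels among samples
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    profiles = [[10, 0, 5], [9, 1, 6], [8, 2, 4],
                [1, 9, 5], [0, 10, 6], [2, 8, 4]]
    labels = ["NASG"] * 3 + ["NPSG"] * 3
    matrix = [[bray_curtis(p, q) for q in profiles] for p in profiles]
    print(permanova(matrix, labels, n_perm=999))
```

The toy profiles in the usage example are invented. With the unbalanced design of this study (9 NASG and 3 NPSG samples), the number of distinct label permutations is limited, which bounds the smallest attainable p-value.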

Fig. 1

Global transcriptional patterns of Trichodesmium communities in the North Atlantic subtropical gyre (NASG) and North Pacific subtropical gyre (NPSG). Correspondence analysis (CA) of the global metatranscriptome. CA ordinations of communities of the NASG and NPSG are shown, with significant (p < 0.05) environmental vectors fitted using the vegan function envfit [65]. Arrows indicate the direction of the (increasing) environmental gradient, and their lengths are proportional to their correlations with the ordination. 95% Confidence ellipses are indicated for each of the sample types by ocean basin. Discrete values for nitrite and nitrate (\(NO_2^ -\) + \(NO_3^ -\)), phosphate (\(PO_4^{3 - }\)), dissolved iron (dFe) in the NPSG, the climatological average for dFe in the NASG, and temperature (Temp) used in this analysis are available in Table S1. *In the NASG, dFe was computed as the climatological average (Table S1)

Despite clear differences in the transcriptional patterns (Fig. 1), Trichodesmium populations were similar in the two regions. Samples in this study, with the exception of NPac_3, were concurrently collected with those of Rouco et al. [15, 16], confirming that Clade I dominated (94% and 92% in the NASG and NPSG, respectively) Trichodesmium field communities in both oceans. Additionally, the average percentage of reads mapping to sequences of each of the four Trichodesmium genomic bins defined in Frischkorn et al. [34] was consistent across regions, with bin 1 the most abundant across all samples, followed by bins 3, 2, and 9 (Fig. S2). Taken together, this evidence suggests that Trichodesmium populations were comparable between regions.

As a first approach to evaluate which pathways might drive the apparent differences in regional transcriptional patterns, OGs were grouped at the module level based on KEGG orthology (Fig. 2). Of the KEGG pathways with a Log2 fold change > 0.5, the metallic cation and Fe-siderophore module, including Fe metabolic pathways, was enriched with a higher proportion of reads in the NPSG relative to the NASG (Fig. 2). In contrast, the module including phosphate transport pathways was enriched in the NASG relative to the NPSG (Fig. 2). Taken together, the observed transcriptional patterns empirically corroborate models that suggest the relative importance of Fe and P in controlling phytoplankton productivity in these two ocean gyres [13, 71] and reflect ecosystem-specific geochemical drivers of Trichodesmium physiology, suggesting that low Fe is a driver of transcriptional patterns in the NPSG, whereas low P shapes observed transcriptional patterns in the NASG.
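The module-level comparison above rests on a log2 fold change between regions with a threshold of 0.5. The text does not specify exactly how module values were aggregated, so the Python sketch below is only one plausible reading. It assumes that normalized OG values are summed per KEGG module within each sample, averaged per region, and compared as log2(NPSG / NASG), with a pseudocount to avoid division by zero. All names, the sign convention, and the pseudocount are assumptions.

```python
import math
from statistics import mean


def module_log2_fold_change(npsg, nasg, pseudocount=1.0):
    """log2(mean NPSG / mean NASG) per module; positive = higher in NPSG.

    npsg, nasg: {module_name: [per-sample summed normalized values]}
    Only modules present in both regions are compared.
    """
    result = {}
    for module in sorted(set(npsg) & set(nasg)):
        numerator = mean(npsg[module]) + pseudocount
        denominator = mean(nasg[module]) + pseudocount
        result[module] = math.log2(numerator / denominator)
    return result


def highlighted_modules(log2fc, threshold=0.5):
    """Keep modules whose absolute log2 fold change exceeds the threshold."""
    return {m: v for m, v in log2fc.items() if abs(v) > threshold}


if __name__ == "__main__":
    npsg = {"Fe transport": [40, 44], "phosphate transport": [10, 12]}
    nasg = {"Fe transport": [11, 9], "phosphate transport": [38, 42]}
    changes = module_log2_fold_change(npsg, nasg)
    print(highlighted_modules(changes))
```

A log2 fold change of 0.5 corresponds to roughly a 1.4-fold difference in relative read abundance, so the threshold flags modest as well as large regional differences.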

Fig. 2

KEGG pathway enrichment in the North Atlantic subtropical gyre (NASG) and North Pacific subtropical gyre (NPSG). Differential expression of Trichodesmium communities from the NASG versus NPSG across KEGG pathways. Highlighted pathways are those with a Log2 fold change > 0.5 (above or below the dashed line). Bold outline indicates KEGG modules associated with iron and phosphorus transport. Here, aa denotes amino acid

Marker genes reflect ecosystem-specific traits that underpin Trichodesmium resource acquisition

The relative expression patterns of individual OGs were compared between the NPSG and NASG to examine specific functions outside of a KEGG framework (Table S4). NPSG and NASG Trichodesmium communities had significant differences in their transcriptional patterns at the OG level, with 30% of the OGs (1401 of 4709 OGs) having significant (FDR < 0.05) differences (Table S4, Fig. S5). All of the individual OGs in this analysis were detectable in both the NASG and NPSG, indicating that significant differences in their expression across oceans were not derived from a complete lack of read mapping in a specific region. The relative abundance of the RotA OG, which has been used as a constitutively expressed marker gene [29], did not significantly differ between regions (Fig. S5). Interestingly, the relative abundances of OGs coding for the multi-subunit proteins of the metalloprotein nitrogenase, involved in N2 fixation, including the Fe nitrogenase reductase protein (NifH) and the Fe-molybdenum dinitrogen reductases (NifD and NifK), were not significantly different between NPSG and NASG populations (Fig. S5), where the range in N2 fixation is overlapping but highly variable [5, 55]. Although nif expression can track with Trichodesmium N2 fixation in culture, maximum nif expression generally occurs 2 h before the maximum nitrogenase activity [42, 72] and the nitrogenase enzyme can be regulated both at the transcriptional and post-transcriptional level [73]. As a result, the nif gene expression data derived here cannot be used to predict patterns in Trichodesmium N2 fixation rates with confidence.

To further constrain the relationship between Fe, P, and Trichodesmium physiology, we tracked a number of OGs corresponding to proteins previously characterized as markers of P or Fe stress. There were three main OGs that contained proteins that have been used in expression studies as markers of Fe stress in Trichodesmium communities: IdiA, an Fe (III) transport system binding protein; FeoB, an Fe (III) transport system Fe permease; and IsiB or flavodoxin, an Fe-free electron transfer protein that can replace ferredoxin during Fe stress [38, 40, 44, 74, 75]. Two of the three OGs, IdiA and FeoB, were significantly enriched in Trichodesmium populations in the NPSG relative to the NASG (Fig. 3, Fig. S5). With a read mapping approach, we cannot fully discount the potential presence of sequence variants for these gene targets in the NPSG that could bias the OG enrichment pattern. However, if sequence variants for IdiA and FeoB were abundant in the NPSG, they would reduce the relative NPSG enrichment signal, not increase it. As such, our observations for these OGs likely represent only minimum levels of NPSG enrichment. IsiB relative abundance was not significantly different between the NPSG and NASG (Fig. 3, Fig. S5). Previous work identified IsiB expression consistent with Fe limitation in the South Pacific, but not the NASG, using a quantitative reverse transcriptase-PCR (qRT-PCR) approach [44]. It is possible that IsiB expression is sensitive to the relative Fe and P geochemistry that differs between the NPSG and South Pacific, which could explain the lack of significant IsiB enrichment in the NPSG observed here. IsiB sequence variants have been detected in the South Pacific [44]. Such variants in primer sites are more of a concern for qRT-PCR than metatranscriptome profiling as applied here. However, we cannot exclude the possibility that NPSG sequence variants for IsiB may be influencing the OG enrichment pattern.
Regardless, the enrichment of markers of Fe stress in the NPSG versus NASG populations (Fig. 3) appears to underpin the global transcriptional patterns (Fig. 1) and these data highlight the importance of Fe as a driver of Trichodesmium physiological ecology in the NPSG.

Fig. 3

Marker gene expression enrichment in Trichodesmium field communities. Fold change in expression of orthologous groups (OG) used as markers of P and Fe limitation and Zn uptake (highlighted in blue, red, and green, respectively) in North Atlantic subtropical gyre (NASG) and North Pacific subtropical gyre (NPSG) communities. Genes listed for Fe and P were all experimentally validated as Fe or P-regulated in Trichodesmium [26, 29, 38,39,40, 44, 74]. Asterisks next to the horizontal bars indicate significance (***p < 0.001, **p < 0.05, *p < 0.1)

Similar to Fe, there are a number of P-scavenging OGs that have been identified as markers of P stress in Trichodesmium [26, 29, 39]. These include OGs involved in the acquisition of \(PO_4^{3 - }\), the preferred and most abundant source of P for marine bacteria [76, 77], as well as those involved in the acquisition of other P compounds, such as phosphonates and phosphoesters, which dominate the dissolved P pool in oligotrophic regions [76, 78]. The relative abundances of representative OGs for a number of P stress markers were significantly higher in Trichodesmium populations in the NASG compared with those of the NPSG (Fig. 3, Fig. S5). These P stress markers included the high-affinity \(PO_4^{3 - }\) binding protein, PstS [29], the high-affinity phosphonate binding protein, PhnD [26], and the AP PhoX, which hydrolyzes P from ester-bond P compounds [29]. It is unlikely that the enrichment patterns observed for the expression of these P-related OGs are an artifact of read mapping. As highlighted above, Trichodesmium diversity is likely similar across regions, and there is a high degree of nucleotide identity for the phoX, phoA, and pstS genes between species and even different clades of Trichodesmium [29]. The maximum % nucleotide identity between sequences within the PstS, PhoA, and PhoX OGs and cultured isolates was >97% (Table S5). This indicates that, although the OG as a whole might contain some slightly divergent gene sequences, in each OG there is at least one sequence that is nearly identical to a gene in T. thiebautii, T. spiralis, and T. tenue. Thus, it is unlikely that expression differences observed for these OGs are consistently an artifact of read mapping. Again, these expression patterns corroborate the global transcriptional patterns (Fig. 1), as well as model results [13] and previous non-species-specific field experiments focused on AP activity and phosphate uptake [51,79,80,81].
Taken together, these data highlight the importance of P as a driver of Trichodesmium physiological ecology in the NASG.

Trichodesmium switches the relative transcript abundance of Fe- and Zn-requiring metalloenzymes for DOP hydrolysis

The PhoX-type AP is more common than the PhoA-type in marine bacteria [30]. Both metalloenzymes are present in the T. erythraeum, T. tenue, T. spiralis, and T. thiebautii genomes [29, 30], and detected in Trichodesmium metagenomes [34]. In the metatranscriptome data here, the relative abundance of the PhoX-type AP was significantly higher in the NASG compared with the NPSG and the relative abundance of the PhoA-type AP was significantly higher in the NPSG (Fig. 3, Fig. S5). Both APs are known to be upregulated under conditions of P stress in Trichodesmium laboratory cultures, although patterns in PhoX transcription suggested that it was the dominant AP under Fe replete culture conditions [29]. The two APs have different substrate specificities and metal co-factors. PhoX can hydrolyze both phosphomono- and di-esters and contains calcium (Ca) and Fe as co-factors [31, 32]. By contrast, PhoA is typically specific to phosphomonoesters and contains zinc (Zn) and magnesium (Mg) co-factors [82, 83]. The patterns of expression of these two OGs across the two basins correlated with patterns of Fe and P biogeochemistry, with a higher PhoA/PhoX expression ratio in the NPSG where the average Fe is lower than in the NASG (Fig. 4). These data suggest that Trichodesmium may alter its PhoA/PhoX ratio to minimize its Fe requirement in low Fe regions like the NPSG. In addition, the expression signals of OGs involved in Zn homeostasis and uptake, such as ZnuA and ZnuC [84], were significantly enriched in Trichodesmium communities in the NPSG (Fig. 3), suggesting that there may be an increased Zn demand associated with the expression of the PhoA AP, which requires Zn. Tradeoffs associated with this strategy would be a higher Zn quota for populations in the NPSG and reduced bioavailability of some DOP substrates.
Although there is limited information on Zn concentrations for these regions, average surface values are higher (~0.24 nM) in the NASG than in the NPSG (~0.07 nM) [47, 85,86,87]. It is not clear why Trichodesmium would favor a Zn-requiring enzyme, PhoA, in a low Zn environment like the NPSG, but the coincident enrichment of Zn transport functions suggests an increase in Zn demand, and that the Zn may be from the environment and not just recycled internally. Regardless, these data suggest a switch between Fe- and Zn-rich metalloenzymes for DOP hydrolysis consistent with climatological patterns in Fe geochemistry of the NASG and NPSG. Fe and Zn limitation of community AP activity had been previously shown using data from bulk AP activity assays, where AP increased after Fe or Zn additions in different ocean regions [33, 80]. This is most likely driven by shifts in the PhoA and PhoX expression ratios, as observed here for Trichodesmium, but not all marine bacteria carry both PhoX and PhoA like Trichodesmium. The apparent metalloenzyme switching observed here might underpin Trichodesmium fitness across variable oligotrophic ocean ecosystems by allowing cells to modulate trace metal requirements and mitigate Fe limitation of DOP metabolism while maintaining other high Fe-requiring processes such as N2 fixation and photosynthesis in low Fe regions [14].
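The PhoA/PhoX expression ratio discussed above (Fig. 4a) can be illustrated with a short Python sketch. Following the figure legend, PhoA is taken as the sum of the PhoA1 and PhoA2 OGs. Whether the ratio was computed per sample and then averaged, or from regional means, is not stated, so the per-sample approach here is an assumption. The function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def phoa_phox_ratios(samples):
    """Per-sample (PhoA1 + PhoA2) / PhoX ratio from normalized OG values.

    samples: list of dicts with keys "PhoA1", "PhoA2", and "PhoX".
    Assumes PhoX is non-zero in every sample.
    """
    return [(s["PhoA1"] + s["PhoA2"]) / s["PhoX"] for s in samples]


def mean_and_sem(values):
    """Return (mean, standard error of the mean); needs >= 2 values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Invented normalized values, for illustration only.
    region_samples = [
        {"PhoA1": 2.0, "PhoA2": 2.0, "PhoX": 4.0},
        {"PhoA1": 1.0, "PhoA2": 1.0, "PhoX": 1.0},
        {"PhoA1": 3.0, "PhoA2": 3.0, "PhoX": 2.0},
    ]
    ratios = phoa_phox_ratios(region_samples)
    average, sem = mean_and_sem(ratios)
    print(f"PhoA/PhoX = {average:.2f} +/- {sem:.2f} (SEM)")
```

Using a ratio of two OGs measured in the same sample reduces sensitivity to differences in overall Trichodesmium read recovery between samples.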

Fig. 4

PhoA-/PhoX-type alkaline phosphatase average expression ratios in Trichodesmium populations of the North Atlantic subtropical gyre (NASG) and North Pacific subtropical gyre (NPSG) with relevant geochemical climatology. a PhoA-/PhoX-type alkaline phosphatase average expression ratios. PhoA here represents the total PhoA signal where normalized values for PhoA1 and PhoA2 OGs were summed. Error bars indicate SEM. Asterisk (*) indicates significance (p = 0.009) using the Wilcoxon test in R. b Climatological averages of phosphate (\(PO_4^{3 - }\)) and dissolved iron (dFe) concentrations in the North Atlantic subtropical gyre (NASG) and North Pacific subtropical gyre (NPSG). The analysis includes surface (~5 m) climatological data from the same region (10°N-30°N and 70°W-25°W in NASG and 20°N-30°N and 150°W-160°W in NPSG) as this study collected throughout different years and seasons. White points indicate the average, and the bars correspond to the maximum and minimum number recorded in the region. R represents the ratio of the higher to the lower nutrient concentration. Notably, \(PO_4^{3 - }\) concentrations are ~6 times higher in the NPSG than the NASG. In contrast, dFe concentrations are ~2 times higher in the NASG than the NPSG. \(PO_4^{3 - }\) data from [28, 44, 46,47,48, 52,53,54,55], and this study. dFe data from [44, 45, 49,50,51, 54, 56], and this study

Conclusion

Taken together, the transcriptional patterns reflect ecosystem-specific geochemical drivers of Trichodesmium physiological ecology and empirically validate geochemical models that predict the importance of P in the control of Trichodesmium growth and N2 fixation in the NASG and Fe as a driver in the NPSG [13]. The expression of phosphate and phosphonate transport genes and AP genes highlights the importance of DOP metabolism in both ecosystems and the tradeoffs that this organism uses to maintain N2 fixation under different geochemical conditions. The findings also suggest that trace elements, such as Zn, should be included in models given their influence on the activity of APs. Trichodesmium is predicted to increase growth and N2 fixation with elevated CO2 in the future ocean [88, 89]. Modeling the traits and tradeoffs observed here in the context of future ocean conditions will help predict concomitant impacts on C and N cycling and their control on marine primary production.