Investigating the microbial ecology of coastal hotspots of marine nitrogen fixation in the western North Atlantic

Variation in the microbial cycling of nutrients and carbon in the ocean is an emergent property of complex planktonic communities. While recent findings have considerably expanded our understanding of the diversity and distribution of nitrogen (N2) fixing marine diazotrophs, knowledge gaps remain regarding ecological interactions between diazotrophs and other community members. Using quantitative 16S and 18S V4 rDNA amplicon sequencing, we surveyed eukaryotic and prokaryotic microbial communities from samples collected in August 2016 and 2017 across the Western North Atlantic. Leveraging and significantly expanding an earlier published 2015 molecular dataset, we examined microbial community structure and ecological co-occurrence relationships associated with intense hotspots of N2 fixation previously reported at sites off the Southern New England Shelf and Mid-Atlantic Bight. Overall, we observed a negative relationship between eukaryotic diversity and both N2 fixation and net community production (NCP). Maximum N2 fixation rates occurred at sites with high abundances of mixotrophic stramenopiles, notably Chrysophyceae. Network analysis revealed such stramenopiles to be keystone taxa alongside the haptophyte diazotroph host Braarudosphaera bigelowii and chlorophytes. Our findings highlight an intriguing relationship between marine stramenopiles and high N2 fixation coastal sites.

Marine phytoplankton mediate significant fluxes of carbon, oxygen, nitrogen, and other elements between organic, atmospheric, and oceanic pools. Nitrogen (N 2 ) fixation is a critical biogeochemical process in which specialized marine prokaryotes provide the surface planktonic community with a supply of new nitrogen obtained from the atmosphere. N 2 fixation supports marine primary production and can increase the sequestration of biological carbon in the deep ocean via sinking plankton biomass 1 . As a result, marine N 2 fixation is an important process influencing the ocean carbon cycle, with implications for marine carbon sink strength and global climate [1][2][3] .
Marine N 2 fixation and the diazotrophic microbes that perform this function are strongly influenced by nutrient availability, ecosystem conditions, and the ecology and structure of the surface microbial community 4-6 . Consequently, microbial oceanographers are devoting considerable effort to studying the links between N 2 fixation, the physical and chemical environment, and community structure. While studies of bottom-up controls on N 2 fixation are well underway 5,[7][8][9] , the complex ecological relationships governing diazotrophs remain only partially explored. For example, there is increasing field evidence regarding the significance of top-down factors in controlling diazotroph populations [10][11][12] , potentially influencing the global distribution of diazotrophs 13 . Our evolving understanding of such ecological dynamics stems from discoveries over the last decade that have significantly altered the scientific knowledge behind marine N 2 fixation. Researchers have discovered a number of additional marine diazotrophs, including UCYN-A-a cyanobacterial endosymbiont of the haptophyte phytoplankton www.nature.com/scientificreports/ Braarudosphaera bigelowii 14,15 -and numerous groups of non-cyanobacterial diazotrophs (NCDs), including members of Gammaproteobacteria and Planctomycetes 10, [16][17][18] . Other key discoveries include the finding that active marine N 2 fixation can occur in the presence of nitrate or ammonium [19][20][21] , that coastal rates of N 2 fixation are much higher than previously thought 22,23 , and that N 2 fixation can be significant even in cold high-latitude regions 20,24,25 , as reviewed in Zehr and Capone, 2020 5 . Such revelations also make it clear that the surrounding microbial community likely exerts ecological influences over diazotrophs in addition to the physical environment and nutrient regime. Colony-forming or hostassociated N 2 fixing bacteria may interact with other marine microbes inhabiting colony or host cell surfaces [26][27][28] . Community interactions might also provide diazotrophs with sources of organic-rich particles and aggregates 29 , and recycled essential nutrients such as phosphorus 30 . Grazers 11,31 , viruses 32 , and parasites represent other ways in which ecological dynamics might regulate N 2 fixing organisms and their hosts via predation upon the host algal cells. Finally, the marine environment itself can be altered by local microbial ecology, such as the transmissivity of the water column to light 33 or the presence of inhibitory compounds produced by competitors, particularly in more restricted coastal waters or at very high cell densities 34 . Marine microbial community structure thus potentially moderates N 2 fixation activity 35 . Hotspots of N 2 fixation might be considered an emergent property of the whole marine microbial community. Such possibilities add considerable value to uncovering ecological conditions or relationships between marine diazotrophs and other taxa within the surface ocean community. While conclusively linking N 2 fixation to microbial abundances and diversity requires understanding underlying ecological mechanisms, we remain at a stage where inferences from the field are necessary to highlight patterns of interest for focused study.
From August 2015 to August 2017, a successive series of research expeditions in the western North Atlantic aboard the R/V Atlantic Explorer sought to pair underway measurements of N 2 fixation with sampling of diazotroph and marine microbial community structure, biogeochemical properties, and a survey of net community production rates. Wang et al., 2018 described a marine microbial community dominated by Aureococcus anophagefferens associated with a coastal site in the Mid-Atlantic Bight exhibiting elevated net community production rates 36 . Using a new method for high-resolution underway observations of N 2 fixation, Tang et al., 2019 identified these productive coastal waters of the Mid-Atlantic Bight and Southern New England Shelf also to be hotspots of elevated N 2 fixation dominated by the cyanobacterial diazotroph UCYN-A, with extremely high rates up to 167 µmol N m −3 day −1 measured off the New Jersey coast 22 . Tang et al. 2020 analyzed macronutrient and micronutrient concentrations within coastal ecosystems with high rates of N 2 fixation. They discovered these environments to be relatively replete in iron and nitrate, identifying phosphorus and temperature as key factors governing N 2 fixation activity and diazotrophic community structure, respectively 37 . In the same study, a nifH molecular survey identified UCYN-A as dominant in cold, subpolar coastal waters with diazotroph populations shifting towards Trichodesmium in warmer, oligotrophic waters offshore.
In this study, we investigated the microbial diversity and ecological relationships underlying the published extensive regional survey of marine N 2 fixation rates in summer reported in Tang et al., 2019 and Tang et al., 2020. Specifically, through statistical inferences, we attempted to assess potential competitive interactions associated with previously-identified coastal hotspots of N 2 fixation, refining our understanding of the ecological niche occupied by diazotrophic plankton in this ecosystem. We hypothesize that given previous findings of record UCYN-A abundances, network-based statistical analysis of co-occurrence patterns will reveal that UCYN-A's host B. bigelowii plays a disproportionately crucial ecological role within this productive coastal community, potentially reflecting its role in transferring bioavailable N into the microbial environment.

Results and discussion
Microbial ecology: marine microbial community composition and spatial and abundance patterns. The earlier discovery that UCYN-A is most likely responsible for observed coastal peaks in N 2 fixation in the Mid-Atlantic Bight and Southern New England Shelf 22 is intriguing given that UCYN-A is presumed to be an obligate endosymbiont of its host Braarudosphaera bigelowii. UCYN-A's record abundances (up to 4 × 10 7 nifH copies L −1 ) 37 raise the question of what ecological conditions permit its host organism to be competitively successful and whether locally elevated N 2 fixation rates could be partially enabled by the success of the host phytoplankton in addition to the favorability of a niche for diazotrophic N 2 fixation.
In our molecular survey across the western North Atlantic (Fig. 1), marine prokaryotic 16S rDNA samples typically displayed high abundances of SAR11 and Prochlorococcus as well as appreciable fractions of Bacterioplankton belonging to the SAR86 clade (Oceanospirillales), Flavobacteriales, and Rhodospirillales. A noticeable shift in community composition towards greater fractions of Planctomycetes, Phycisphaerales, and Sphingobacteriales was observed for stations with high N 2 fixation rates and net community production (NCP) rates (Fig. 2).
Generally, eukaryotic 18S rDNA samples were dominated by dinoflagellate OTUs (Fig. 3), particularly Syndiniales-endoparasitic alveolates infecting a wide range of organisms from ciliates to cercozoa, zooplankton, and fish eggs [39][40][41] . The life cycle of Syndiniales also involves a free-living stage in which large numbers of Syndiniales are dispersed as dinospores, perhaps partially explaining their high abundances in many amplicon studies 39 . Eukaryotic dinoflagellates belonging to Gonyaulacales, Gymnodiniphycidae, and Dinophyceae appeared at around 5-10% relative abundances across most samples. www.nature.com/scientificreports/ Samples from the MAB and New England shelf exhibiting high N 2 fixation and NCP rates in 2015 and 2017 were dominated by Chrysophyceae and the pelagophyte alga Aureococcus anophagefferens. Strikingly, of the seven stations with the highest N 2 fixation rates, five showed extremely elevated Chrysophyceae abundances (25-55% of 18S rDNA abundances). Several of these samples also displayed elevated abundances of the chlorophyte Mamiellales. These samples from the MAB in August 2015 and 2017 clustered closely together during PCoA analysis, suggesting consistency in broad community structure between years (Fig. 4). Analysis of variance testing via the adonis function (999 permutations) in the "vegan" package revealed that the differences in community composition between 18 and 16S samples collected north of the Gulf Stream versus south of the Gulf Stream were statistically significant (p < 0.001).
Although we observed high N 2 fixation, high NCP, and high abundances of Chrysophyceae in the MAB in both August 2015 and 2017, we note that Aureococcus anophagefferens did not bloom here to the same extent in 2017 as in 2015 36 . In 2017, Aureococcus occurred at 0.7-10.1% of the eukaryotic community within the MAB, with an estimated 7.2 × 10 7 to 2.2 × 10 9 18S rDNA gene sequences l −1 , contrasted with 2015 sampling (1.8 × 10 8 to 2.0 × 10 10 rDNA genes l −1 ). The absence of a similarly dramatic 2017 Aureococcus bloom in the MAB suggests that such events may not recur annually, or that blooms may be subject to variable timing and magnitude. Existing literature reporting upon coastal monitoring of Aureococcus indeed demonstrate that Aureococcus abundances may fluctuate by an order of magnitude between months during the growing season and that the seasonal biomass peak may precede the start of August by several weeks 43 . Broad patterns: community diversity vs. N 2 fixation and NCP. We observed a negative relationship between N 2 fixation and eukaryotic community Shannon's H diversity (Pearson = − 0.73, p < 0.001; Spearman's rho = not significant), in which stations with lower eukaryotic diversity tended to exhibit higher N 2 fixation rates (Fig. 5). A similar negative relationship was observed between NCP rates and eukaryotic diversity (Pearson = − 0.70, p < 0.001; Spearman's rho = − 0.31, p < 0.05). These correlations appear to be driven by both by decreasing species richness and evenness, as samples drawn from locations with high N 2 fixation and NCP rates tended to contain substantially fewer OTUs, often two standard deviations lower than the rest of the dataset (mean: 397, s.d.: 131), yet were also dominated by particular taxa such as Chrysophyceae and A. anophagefferens. We note that the linear regressions shown in Fig. 5 are included as visual aids, and not meant to assert that these relationships are necessarily linear.
Prokaryotic Shannon's H diversity was weakly correlated to N 2 fixation rates (Pearson = 0.3, p < 0.01; Spearman's rho = 0.48, p < 0.01) but not with NCP (p > 0.05) (Fig. 5, Table S2). We also note that samples collected in 2016 typically exhibited a narrow range of eukaryotic diversity compared to the range observed for other years (  www.nature.com/scientificreports/ including macronutrient concentrations, dissolved and particulate trace metals, and silica, with trends discussed in detail in the Supplementary Material. These results are of interest for considering the ecological characteristics of the coastal hotspots of N 2 fixation and NCP sampled during this survey. Productivity-diversity relationships in oceanic ecosystems can provide insight into ecological processes involved, such as competitive exclusion 44 , "kill-the-winner" 45 , and co-existence.
Increasing competitive exclusion driven by opportunistic bloom taxa has been proposed as an explanation for the negative slope portion of unimodal productivity-diversity relationships 46 . Earlier studies suggested a unimodal "humped" shape to such relationships, in which diversity increases with productivity, peaking at intermediate levels of production, then falling 47,48 . More recent work has not only documented a range of productivity-diversity relationships ranging from positive to negative to flat 49,50 , but has also raised methodological concerns, postulating instead that diversity may not be correlated with productivity, with rare taxa impacting species richness but not greatly influencing production 51 .
Our results, showing negative relationships with decreasing eukaryotic diversity for increasing N 2 fixation and NCP, raise the possibility that competitive mechanisms could play a role in limiting eukaryotic diversity within www.nature.com/scientificreports/ the sampled coastal blooms. One potential mechanism is that the dominant eukaryotic plankton identified in our sampling, Chrysophyceae, and Aureococcus, can produce allelopathic compounds to inhibit the growth of algal competitors 34,52,53 . Given the record abundances of UCYN-A observed at these sites 37 , such patterns raise the possibility that UCYN-A and its host Braarudosphaera bigelowii can tolerate these chemicals. Aureococcus is adapted for low-light conditions and capable of leveraging diverse uptake pathways for dissolved organic carbon and nitrogen 54,55 while Chrysophyceae can engulf particulate matter via phagotrophy 56,57 , potentially giving both groups of mixotrophic algae versatile capabilities for out-competing other taxa. A positive relationship between N 2 fixation and prokaryotic diversity could imply a greater availability of varied bacterial niches within N 2 fixation hotspots, but the observed relationships in this study are too weak to confidently support such a hypothesis. An OTU-based definition for prokaryotic diversity is imperfect, given the ambiguity of the "species" concept for prokaryotes 58,59 . Observed relationships may depend on how diversity is defined and which functional groups (algae, free-living and particle-associated bacteria, viruses) are captured by field methods and considered part of the community. The observed relationships with eukaryotic diversity are also strongly driven by the samples taken in 2017 and 2015 off the New England Shelf, raising the possibility that patterns with diversity would shift with broader geographic sampling or data from summers in other years.    www.nature.com/scientificreports/ Furthermore, our dataset remains too limited to investigate the possibility that underlying relationships are curvilinear as opposed to linear. Different taxonomic definitions, such as amplicon sequence variants 60 could also affect biodiversity metrics, a topic that is beyond the scope of this study but worthy of further investigation.
Fine-scale relationships: key community members associated with N 2 fixation and NCP. Our partial least squares (PLS) regression analyses identified specific prokaryotic and eukaryotic taxa as jointly associated with high N 2 fixation and high NCP rates (Fig. 6). Among prokaryotes, abundances for all Planctomycetes taxa binned at the fourth taxonomic rank were related to N 2 fixation and NCP. 2 of 4 Actinobacteria taxa, 3 of 5 Verrucomicrobia, and 2 of 3 Betaproteobacteria also exhibited relationships with N 2 fixation and NCP, as did Arenicellales (Gammaproteobacteria), Sphingobacteriales, and Cytophagia Order III from within Bacteriodetes. PLS regression analyses performed at the genus level identified a strong relationship between UCYN-A, a haptophyte-associated, N 2 fixing group of unicellular cyanobacteria, and both N 2 fixation and NCP (Dataset S1g).
Planctomycetes are largely heterotrophic, can exhibit surface and particle-associated lifestyles 61 and have been observed at high abundances in coastal environments 62 . Consequently, relationships with productivity and diazotrophic activity could potentially reflect the particle-rich environment suggested by high quantitative abundances of Chrysophyceae and Aureococcus anophagefferens. A higher concentration of phytoplankton surfaces for growth might similarly explain N 2 fixation and NCP associations displayed by Bacteriodetes and Verrucomicrobia, whose marine representatives are also typically particle-associated [63][64][65] . Alternatively, recent research indicates that heterotrophic bacterial diazotrophs, including some Planctomycetes, are more widespread and abundant in the surface ocean than previously thought 17 , potentially contributing directly to N 2 fixation and thereby promoting new production.
Eukaryotic groups associated with both N 2 fixation and NCP included most cryptophytes, Ichthyosporea, Cercozoa, Prymnesiales, Retaria, Chlorophyta, and several marine stramenopiles (MASTs) (Fig. 6). The taxon www.nature.com/scientificreports/ Ochrophyta, containing the dominant stramenopiles within the coastal transect-Chrysophyceae and Aureococcus anophagefferens-was also predictably related to NCP and N 2 fixation. At the genus level, UCYN-A's hypothesized prymnesiophyte host, Braarudosphaera bigelowii, was related to N 2 fixation and NCP (Dataset S1h). PLS regression analyses of taxonomic abundances in relation to macronutrients, trace metals, and silica are shown in the Supplementary Material. Network and graph-alignment analyses revealed several eukaryotic taxa associated with N 2 fixation and NCP as hub organisms within their subnetworks or communities. These hub taxa showed high numbers of significant co-occurrence relationships, implying that they could play central community roles (Fig. 7). For example, Subnetwork 1 in Fig. 7 described a community associated with high N 2 fixation and productivity sites in the 2015 coastal bloom (Fig. S3). Within this community, network analysis identified marine stramenopiles as important hub taxa, with four stramenopile genera including two chrysophytes placing among the top 10 keystone nodes. Notably, our results highlighted Braarudosphaera bigelowii as the taxon with the single highest network centrality within its N 2 fixation and NCP-associated subnetwork, suggesting that this diazotroph-hosting prymnesiophyte may be ecologically influential. Taxa with high network centrality are critical to subnetwork stability by exhibiting a high number of connections with other OTUs. Other marine microbiome studies along the California coast 66 and off Brazil 67 have also found evidence that B. bigelowii may play important ecological roles within other microbial ecosystems. Equally interesting were taxa associated with N 2 fixation and NCP in PLS regression analysis but exhibiting few sub-network connections, such as cryptophyte algae and Aureococcus. Their elevated abundance in productive samples but low participation in network structure could potentially suggest that cryptophytes and Aureococcus benefit opportunistically from ecosystem conditions but possess fewer direct interactions with other microbes in their environment. Subnetwork 2 described a community generally associated with coastal sampling in 2015 and 2017 (Fig. S3). A similar centrality analysis revealed several chlorophyte genera-Micromonas, Bathycoccus, and Ostreococcus-as important keystone nodes. High network centrality associated with these taxa could conceivably reflect a potential role of these small, genetically-streamlined chlorophytes as prey within the community for larger mixotrophic and heterotrophic protists and a shared response alongside other taxa to favorable environmental conditions. That said, possibilities exist for more complex interactions-members of Micromonas in polar waters have recently been identified as mixotrophs, practicing bacterivory 68 . Our data cannot directly provide insight into these hypotheses. However, the high network centrality of chlorophyte taxa and their abundance in high N 2 fixation and NCP samples suggest that they may be ecologically important members of the productive coastal community sampled.
Stramenopiles, including several Chrysophyceae, were also identified as essential taxa within the low N 2 fixation, low-productivity Subnetwork 3 together with alveolates, driving much of the network structure within this module. To emphasize common patterns between stramenopiles in these subnetworks, we now examine the results of our graph alignment analysis. A graph alignment consists of identifying OTUs from two distinct communities that possess (i) similar marker gene sequences (i.e., sequence identity) and (ii) similar roles within the communities' topological structure (i.e., patterns of centrality). Aligned OTUs from different subnetworks imply similar ecological roles in their respective communities. A large fraction of the significant alignments between Subnetwork 3 and Subnetworks 1 and 2 involved stramenopiles and alveolates from Subnetwork 3. This alignment result suggests that a community transition between these conditions could be associated with the resilience of these groups of eukaryotes and their related ecological functions.
Our overall findings corroborated this result. The observed close correspondence between peaks in N 2 fixation and Chrysophyceae abundances supports the theory that stramenopiles potentially play a key role in our study region. However, this also highlights our lack of general understanding of the function and role of stramenopiles. To our knowledge, the oceanographic literature so far does not record any known direct association between Chrysophyceae and diazotrophs that could drive such a pattern, motivating the proposal of hypotheses to drive further studies. Chrysophyceae and diazotrophic activity could simply co-vary in response to a separate stimulatory factor. Alternatively, diazotrophic inputs of N might, through recycling via the microbial loop, act to promote Chrysophyceae growth. Mixotrophic Chrysophyceae possess diverse metabolic capabilities including heterotrophic uptake of organic matter using phagotrophy, potentially supporting such a hypothesis 56,57 . High abundances of Chrysophyceae might, on the other hand, create favorable conditions for UCYN-A and its host B. bigelowii through the competitive exclusion of competitors. Or, a direct interaction between Chrysophyceae and yet-uncharacterized N 2 -fixers could represent an unlikely but intriguing possibility. At any rate, this pattern of association between Chrysophyceae and high measured N 2 -fixing activity warrants a more detailed investigation driven by interaction studies.

Conclusion
Our study of prokaryotic and eukaryotic community structure within hotspots of N 2 fixation and NCP revealed an intriguing microbial ecology. We observed a strong negative relationship between eukaryotic diversity and both N 2 fixation and O 2 /Ar-derived NCP, a finding suggesting that low eukaryotic microbial diversity at productive coastal sites might potentially be driven by competitive exclusion. Notably, we observed high abundances of Chrysophyceae occurring in strikingly close conjunction with high N 2 fixation rates and NCP, along with elevated abundances of Aureococcus anophagefferens, chlorophytes, cryptophytes, and Planctomycetes. Such a community structure indicates that observed high N 2 fixation rates may be occurring in microbial ecosystems with a high potential to process and take up diverse forms of organic material and nutrients. Aided by our quantitative study design, our network analysis highlighted Braarudosphaera bigelowii, chlorophytes, and Chrysophyceae and other marine stramenopiles as potentially important community hub taxa. Overall, our findings point to notable taxa driving the ecological network structure of coastal N 2 fixation hotspots and, particularly in light of the  . The leftmost pair of identical axes represent taxa belonging to the first subnetwork, with edges between the left two axes signifying significant co-occurrence relationships between taxa within the subnetwork. The rightmost axes and lines similarly represent taxa from the second subnetwork and within-network co-occurrence relationships. Edges between the two networks represent aligned taxa, where an OTU shares a similar taxonomic identity (sequence similarity) and a high degree of network topological similarity with an OTU in the other subnetwork. Alignments suggest that OTUs fill similar ecological roles within their respective subnetworks. Circular nodes represent individual OTUs, colored and grouped at the third taxonomic rank. Larger circle diameter indicates higher network importance (centrality). Taxa are ordered by centered log-ratio (CLR) normalized abundance, with higher-abundance genera placed further from the origin. www.nature.com/scientificreports/ establishment of a new LTER site along the northeast U.S. coastal shelf (NES-LTER) 69 , can serve as a springboard for targeted investigations into the processes mediated by these microorganisms of interest.

Materials and methods
Underway measurements of N 2 fixation rates and net community production. Our combined set of new and previously-published data includes around 10,250 km of continuous and discrete measurements across the western North Atlantic Ocean over three summer voyages aboard the R/V Atlantic Explorer. Previously-reported surface layer N 2 fixation rates 22,37 were measured at high resolution using the Flow-through incubation Acetylene Reduction Assay by Cavity ring-down Spectroscopy (FARACAS) 38 , which utilizes a cavity ring-down spectrometer to measure the conversion of dissolved acetylene to ethylene by diazotrophs within surface seawater. Previously-reported measurements of NCP 22,36,37 were conducted using the dissolved O 2 /Ar method via Equilibrator Inlet Mass Spectrometry 70 . Detailed protocols and data analysis procedures can be found in previous publications 22,36,37 . We conducted a sensitivity analysis to assess the effect of vertical O 2 /Ar fluxes upon calculated NCP rates, detailed in the Supplementary Methods, which we deemed to be minimal. The cruise tracks, conducted from 3 to 13 August 2015, 3-12 August 2016, and 29 July-7 August 2017, are illustrated in Fig. 1, with associated physical and biogeochemical properties shown in Fig. S1 and S2. Our N 2 fixation technique calculates daily fixation rates, while NCP measurements integrate over 2-8 days, both reasonable timescales for comparisons between community structure and these rate measurements.    36 (see Supplementary Material for additional details). Following taxonomy assignment, we removed low read count samples (< 100 sequences) and filtered internal standard, plastid, mitochondrial, and metazoan rDNA sequences from our taxonomy tables. Phylogenetic trees were generated using the make_phylogeny.py script in QIIME, which utilizes the FastTree method. www.nature.com/scientificreports/ to that OTU in the sequencing output, divided by the product of the standard recovery ratio (# standard reads recovered / # standard reads introduced) and the volume filtered. The internal standard recovery ratios obtained in this study are shown in (Fig. S4). Details of filtration of low-count samples and outliers and rarefaction and ordination procedures are described in the Supplementary Material. The internal standard technique has previously demonstrated good agreement with other quantification techniques including flow cytometry counts, HPLC, phospholipid fatty acid (PLFA) analysis, and substrate-induced respiration (SIR) analysis [79][80][81] . However, the approach remains subject to limitations. This quantitative method does not account for variable rDNA gene copy number across bacterial and eukaryotic taxa. Prokaryotic rDNA copy numbers mostly vary between 1 and 15 copies per cell 82 , while eukaryotic rDNA gene copies cell -1 can vary by orders of magnitude. Additionally, the internal standard technique assumes that recovery rates are identical for internal standard reads and natural sequences, an assumption that primer, amplification, or DNA extraction biases could impact. Consequently, calculated abundances should not be interpreted as absolute estimates of organismal cell concentrations, but rather reflect the changing quantitative abundance of taxa across samples.

Statistical analyses.
Calculation of quantitative 16S rDNA gene abundances produced a discrepancy in which samples collected in 2015 displayed significantly higher total 16S rDNA abundances (Wilcoxon rank-sum test, p < 0.001) than 16S rDNA samples obtained in 2016 and 2017. This difference in calculated abundances likely results from quantitation or dilution errors introduced while synthesizing the varying 16S spike quantities and stock solutions between years as mentioned above and from differences in preservation protocols. To investigate this, we compared our calculated SAR11 rDNA copy numbers for stations within 400 km of Bermuda against reported summer SAR11 abundances of 1.5-2.0 × 10 8 cells/l for this region 83,84 . While calculated 2016 and 2017 SAR11 abundances (1.41 × 10 8 cells/l and 1.38 × 10 8 cells/l) aligned closely with values from the literature, samples from 2015 near Bermuda displayed a median SAR11 abundance of 6.5 × 10 8 cells/l. Consequently, we adjusted calculated 2015 16S rDNA abundances by a factor of 0.21. Adjusted 2015 organismal abundances for SAR11 and Prochlorococcus align closely with expected abundances from the literature (Table S2). We further cross-validated our corrected organismal abundances for Prochlorococcus against patterns of Prochlorococcus abundance obtained from concurrently-collected flow cytometry samples, observing good agreement in abundance patterns between the two datasets (Table S1, Fig. S5). The corrected 2015 16S rDNA dataset was subsequently used alongside the 2016 and 2017 datasets in generating taxonomy plots, ordination analyses, Partial least squares (PLS) regression analyses, and co-occurrence network construction.
Partial least squares (PLS) regression analysis was performed following Wang et al., 2018 36 upon 18S and 16S taxa at the fourth and sixth taxonomic ranks. The fourth taxonomic rank was the highest resolution for which heatmap figures are readily visually interpretable, while the sixth (genus) rank is the highest taxonomic level that the 18S and 16S rDNA amplicon can accurately and reliably resolve. Taxa at the fourth and sixth taxonomic ranks with read counts comprising < 0.5% of the total sum of 18S or 16S reads were filtered from analyses. Two separate PLS regression analyses were performed: one utilizing the full molecular dataset from all 3 years to assess taxon-specific relationships with NCP and N 2 fixation, and another utilizing only the community data from the subset of samples collected in 2016 and 2017 to investigate relationships between specific taxa and macronutrient and trace metal concentrations (see Supplementary Material). To handle missing data, the non-iterative partial least squares (nipals) algorithm was employed.
Network analysis. Weighted gene correlation network analysis (WGCNA) 85 was performed to select subnetworks of co-occurring marine microbial OTUs from the whole dataset. Following the WGCNA standard protocol 86 , we constructed an adjacency matrix of the microbial community dataset using Pearson correlations (R 2 cutoff = 0.85) 87 . A soft-thresholding power law (p = 8) was applied to fit the weighted graph into a scalefree topology. A topological overlap measure (TOM) was then calculated for each pair of OTUs, based on the weight of their pairwise correlations with close neighbors and weighted correlations with other OTUs. Finally, hierarchical clustering based on TOM was performed to identify sub-networks of co-occurring taxa. Two subnetworks (Subnetworks 1 and 2) that were most significantly positively associated with both NCP and N 2 fixation (Figs. S3, S6) were extracted, as well as the subnetwork (Subnetwork 3) most negatively correlated with these rates, although not significantly.
A global, robust network was built via FlashWeave 88 . Induced subgraphs, as identified from WGCNA, were then extracted. Subnetwork comparison was then performed using L_GRAAL, a graph alignment tool 89 . L_ GRAAL aligns OTUs from two distinct co-occurrence networks when they share (i) similar topological properties and (ii) rDNA sequence homology (alpha = 0.4 for accurate consensus between both features) 90 . Visualization of subnetworks and alignments was performed with HiveAlign. More details on specific steps taken for WGCNA, Flashweave, and L-GRAAL analyses can be found in the Supplementary Material. www.nature.com/scientificreports/