## Introduction

Nitrogen (N2) fixation, conducted by a group of specialized microorganisms called diazotrophs, provides the largest external nitrogen input into the ocean [1, 2]. Alleviating nitrogen limitation over a large portion of the global surface oceans, N2 fixation supports new production and net oceanic carbon uptake [3, 4]. Over geological time scales, N2 fixation is believed to compensate for nitrogen removal from denitrification and anammox [5]. The balance between these microbial processes influences oceanic productivity, export of carbon to the deep ocean and ultimately atmospheric CO2 concentrations [6, 7]. Therefore, understanding the factors regulating N2 fixation and diazotrophs is important for evaluating and predicting changes to nitrogen and carbon cycling in the global ocean.

The dominant environmental controls on N2 fixation rates and diazotrophs in the global ocean remain elusive. Various factors have been proposed as controls, including temperature [8], phosphorus [9], iron [10], nutrient supply ratios [11], zooplankton grazing [12], and a combination of multiple factors [13, 14]. Specifically, warm and oligotrophic subtropical waters enriched in iron while depleted in nitrogen are traditionally recognized as hot spots of N2 fixation and diazotrophs [11, 15]. In the North Atlantic, phosphorus and iron availability are believed to be the major factors regulating diazotroph distribution and activity [9, 14, 16,17,18]. However, observations used in these analyses are mostly distributed in the central and eastern North Atlantic, leaving the western and temperate North Atlantic relatively undersampled. In addition, although pioneering studies on N2 fixation date back to the 1960s in the Sargasso Sea [19], much of the early work focused on large-size Trichodesmium colonies captured via plankton nets [20]. This sampling technique may overlook the potential role of small unicellular cyanobacterial (e.g., UCYN-A) and non-cyanobacterial diazotrophs, whose importance has become increasingly recognized [21, 22]. Thus, aside from Trichodesmium, the diazotrophic community in the western North Atlantic remains poorly characterized. Since N2 fixation is the cumulative product of the diazotrophic activity of different microorganisms, a more comprehensive characterization of diazotrophs is required to resolve the environmental controls on N2 fixation in the North Atlantic.

To that end, our study builds on the survey of N2 fixation in Tang et al. [23], revisits the hot spots along the US Eastern Seaboard, and extends observations to higher latitudes (30–43° N) using a method we recently developed to estimate N2 fixation continuously [24]. We couple these measurements with a newly developed high-frequency autonomous microbial molecular sampling approach to contextualize these rates with diversity and gene expression patterns of N2-fixing organisms (Robidart et al., unpublished). This unprecedented combination of high-resolution N2-fixation observations and molecular sampling allows us to capture and characterize episodic diazotrophic blooms and evaluate their environmental controls.

## Materials and methods

### Sample collection and analyses

During a 10-day cruise between July 29 and August 7, 2017 traveling north from Bermuda to the Nova Scotia coast, then south to the New Jersey coast and finally back to Bermuda (Fig. 1), surface seawater (~5 m) was continuously pumped via a trace metal clean tow-fish system (GeoFish, manufactured by University of California Santa Cruz) into the laboratory aboard the R/V Atlantic Explorer [25]. This seawater was used for nutrient, trace metal, and microbial molecular sample collection and for N2 fixation incubations. Nutrients (nitrate + nitrite and phosphate) were sampled in acid-washed 15 ml polypropylene tubes (Falcon, Corning, USA) and immediately preserved in a −20 °C freezer. Nutrients were then analyzed colorimetrically with an automatic nutrient analyzer on land [26, 27]. The detection limits for nitrate + nitrite and phosphate were 0.01 μM and 0.014 μM, respectively. The trace metal samples were collected directly in a laminar flow hood within a custom-built clean bubble. Samples were filtered through 0.2 μm cartridge filters (Sartobran 300, Sartorius, Germany) into 60 ml LDPE bottles cleaned following GEOTRACES protocols [28]. Samples for dissolved manganese (DMn), copper (DCu) and iron (DFe) were acidified on board with HCl (ultrapure, Merck), then stored at ambient temperature and finally analyzed by SF-ICP-MS back in the lab on land [29]. A detailed description of the trace metal sampling and analyses can be found in Tang et al. [23].

### Measurement of N2 fixation, primary production, and net community production rates

Surface N2 fixation rates shown in Fig. 1a were measured continuously underway via Flow-through incubation Acetylene Reduction Assay by CAvity ring-down Spectroscopy (FARACAS) [24]. Briefly, acetylene (C2H2) is dissolved into filtered seawater and is then mixed with surface seawater collected continuously via a tow-fish, reaching a 10% C2H2 saturation. The mixed seawater is pumped into a 9 L flow-through incubation chamber at ~100 ml min−1, giving a 90 min residence time for the flow-through seawater. After incubation, the ethylene (C2H4) in the incubated seawater is extracted by a bubble column reactor using a ~35 ml min−1 C2H4-free stripping gas and then measured by cavity ring-down spectroscopy (CRDS). CRDS measures C2H4 concentration every few seconds at high precision. The C2H4 background in the mixture of C2H2 tracer and seawater is determined by bypassing the incubation chamber every few hours. The C2H4 concentration difference between incubation and bypass is used to calculate near real-time C2H4 production rates. Finally, the C2H4 production rates are converted to N2 fixation rates using a 4:1 ratio. The detection limit of FARACAS is ~0.19 nmol N L−1 d−1, which is comparable to that of the discrete 15N2 addition method. Surface N2 fixation and primary production rates at 22 discrete stations were measured in triplicate using the dissolved 15N2 and 13C addition methods. A detailed description of the isotope incubation protocols can be found in Berthelot et al. [30]. N2 fixation was considered detected when the isotopic enrichment was higher than 0.00146 atom% [31]. This led to detection limits ranging from 0.11 to 2.23 nmol L−1 d−1 (depending on the particulate N collected on filters). Using this criterion, N2 fixation was detected at 21 of the 22 stations.
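The arithmetic behind the continuous rate estimate can be illustrated with a minimal sketch. It is not the FARACAS processing code: the function names are hypothetical, the inputs are assumed to be dissolved C2H4 concentrations in nmol L−1, the production rate is assumed to be the incubation-minus-bypass difference divided by the residence time, the 4:1 ratio is assumed to mean moles of C2H4 per mole of N2, and each N2 is counted as two N atoms.

```python
def residence_time_min(chamber_volume_l: float, flow_ml_per_min: float) -> float:
    """Mean residence time (min) of seawater in a flow-through chamber."""
    return chamber_volume_l * 1000.0 / flow_ml_per_min


def n2_fixation_rate(c2h4_incubated_nmol_l: float,
                     c2h4_bypass_nmol_l: float,
                     residence_min: float,
                     c2h4_per_n2: float = 4.0) -> float:
    """Approximate N2 fixation rate in nmol N L^-1 d^-1.

    Assumptions (not taken from the source): concentrations are dissolved
    C2H4 in nmol L^-1, and the rate is the simple difference between the
    incubated and bypass signals divided by the residence time.
    """
    delta_c2h4 = c2h4_incubated_nmol_l - c2h4_bypass_nmol_l
    c2h4_rate_per_day = delta_c2h4 / residence_min * 1440.0  # 1440 min per day
    n2_rate_per_day = c2h4_rate_per_day / c2h4_per_n2        # nmol N2 L^-1 d^-1
    return 2.0 * n2_rate_per_day                             # two N atoms per N2


# Chamber geometry quoted in the text: 9 L at ~100 ml min^-1.
tau = residence_time_min(9.0, 100.0)
# Hypothetical C2H4 readings, for illustration only.
rate = n2_fixation_rate(1.5, 0.5, tau)
```

With the chamber volume and flow rate given above, the first function reproduces the 90 min residence time; the C2H4 readings in the usage lines are made-up values.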

Net community production (NCP) rates were concurrently estimated based on continuous dissolved O2/Ar measurements by equilibrator inlet mass spectrometry [32]. Since O2 and Ar have similar solubility properties while Ar is biologically inert, Ar-normalized O2 can infer biological O2 production by removing physical influences on O2 concentration [33]. NCP represents the difference between primary production and community respiration, which can be used as a proxy for new production and/or export production. The contribution of N2 fixation to NCP was assessed by estimating their carbon fixation potentials with conversion factors of C:N = 106:16 and C:O2 = 1:1.4 following Tang et al. [23]. Assuming constant C:N and C:O2 ratios introduces uncertainties in our estimates. However, while variation in the conversion factors may affect the absolute contribution of N2 fixation to NCP, they have smaller effects on the spatial pattern.
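As a rough illustration of how the two conversion factors combine, the sketch below expresses N2 fixation as a fraction of NCP in carbon units. The function name is hypothetical, and both inputs are assumed to have already been brought to the same basis (for example, both integrated over the mixed layer and expressed per day), a step that is not shown here.

```python
C_PER_N = 106.0 / 16.0   # mol C fixed per mol N (C:N = 106:16)
O2_PER_C = 1.4           # mol O2 produced per mol C (C:O2 = 1:1.4)


def n2_fixation_fraction_of_ncp(n2_fixation_n: float, ncp_o2: float) -> float:
    """Fraction of NCP that N2 fixation could support, in carbon units.

    n2_fixation_n : N2 fixation in N units (e.g. mmol N m^-2 d^-1)
    ncp_o2        : O2/Ar-based NCP in O2 units on the same basis
                    (e.g. mmol O2 m^-2 d^-1); assumed to be positive.
    """
    if ncp_o2 <= 0:
        raise ValueError("NCP must be positive to express a fraction")
    carbon_from_n2_fixation = n2_fixation_n * C_PER_N
    ncp_carbon = ncp_o2 / O2_PER_C
    return carbon_from_n2_fixation / ncp_carbon


# Hypothetical values, for illustration only.
fraction = n2_fixation_fraction_of_ncp(0.8, 14.0)
```

Because both factors enter as constant multipliers, changing either one rescales every station by the same amount, which is why the spatial pattern is less sensitive to them than the absolute contribution.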

### Molecular sampling

Over 200 molecular samples were collected at high frequency using a newly developed autonomous filtration system: Robotic Cartridge Sampling Instrument—RoCSI (Robidart et al., unpublished) (Fig. 1b). Briefly, around 2 L of seawater taken from the tow-fish system was filtered through a 0.2 µm Sterivex filter (Millipore, MA, USA) approximately every hour. The Sterivex filters were immediately preserved by adding RNAlater solution and stored at room temperature until transferred to a −80 °C freezer within 24 h. In addition to hourly sampling by RoCSI, we collected over 100 discrete molecular samples in triplicate nearly every 6 hours (i.e. ~4 times daily) from surface underway line and from 10 CTD stations. Depending on particle concentrations and biomass, ~1.5 L to over 4 L of seawater was filtered on each Sterivex filter using a peristaltic pump. The Sterivex filters were flash-frozen in liquid nitrogen and stored at −80 °C. These filters were later used for DNA and RNA analyses. Data obtained from 34 RoCSI samples as well as 9 discrete molecular samples collected at around noon local time daily are presented in this study (Fig. 1b).

### DNA and RNA extractions

DNA and RNA were extracted from the Sterivex filters using the AllPrep DNA/RNA Mini kit (Qiagen, MD, USA), including an initial bead-beating step (2 min at 30 Hz in a Qiagen TissueLyser) in RLT + buffer before following the manufacturer’s instructions. DNA was digested from RNA samples using an RNase-free DNase set (Qiagen, MD, USA) and the RNA Clean & Concentrator kit (Zymo, CA, USA). DNA and RNA were eluted in 50 μl EB buffer and 50 μl RNase-free water, respectively. The DNA and RNA extracts were quantified using the Qubit™ DNA/RNA HS Assay Kit (Invitrogen, CA, USA) according to the manufacturer’s guidelines. In addition, RNA purity and integrity were checked using RNA chips on an Agilent 2100 Bioanalyser (Agilent, CA, USA).

### nifH PCR amplification, amplicon sequencing, and sequence analysis

Diazotrophic communities were characterized by sequencing nifH genes from nine discrete samples collected at around noon during the cruise (Fig. 1b). DNA extracts from stations 7, 8 and 9 were pooled because of their low DNA concentrations. nifH genes were amplified using a nested PCR protocol [34], with second-round PCR primers (nifH1 and nifH2) containing Illumina tag sequences (Supplementary Table 1). nifH amplification and PCR conditions are described in Turk et al. [35]. Equimolar concentrations of nifH amplicons were pooled and sequenced on the Illumina MiSeq platform at the University of Southampton’s Environmental Sequencing Facility. The sequencing data have been deposited in the NCBI Sequence Read Archive under accession number PRJNA554315.

We used QIIME1 to process the amplicon sequencing data [36]. Raw sequences were merged and quality filtered, whereby sequences with a quality score <20, expected number of errors >1, and reads <200-bp long were removed using USEARCH [37]. On average, 37,366 sequences per sample were obtained after quality control (min = 33,500 and max = 43,333). Sequences were further clustered into operational taxonomic units (OTUs) at 97% sequence identity after removing chimeras via UPARSE [38]. Representative sequences from nifH OTUs were assigned putative taxonomies using the BLAST resource on NCBI yielding a final dataset of 550 OTUs.
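The read-level filtering criteria can be sketched in a few lines. This is a stand-alone illustration, not the USEARCH implementation: the function names are hypothetical, the expected number of errors is computed from Phred scores as the sum of 10^(−Q/10), and the "quality score <20" criterion is interpreted here as a threshold on the mean Phred score of the read, which is an assumption because the text does not say how that criterion was applied.

```python
def expected_errors(phred_scores):
    """Expected number of base-call errors: sum of 10^(-Q/10) over the read."""
    return sum(10.0 ** (-q / 10.0) for q in phred_scores)


def passes_quality_filter(phred_scores,
                          min_length=200,
                          max_expected_errors=1.0,
                          min_mean_quality=20.0):
    """Return True if a merged read survives the three filters.

    The mean-quality interpretation of the Q20 criterion is an assumption.
    """
    if len(phred_scores) < min_length:
        return False
    mean_quality = sum(phred_scores) / len(phred_scores)
    if mean_quality < min_mean_quality:
        return False
    return expected_errors(phred_scores) <= max_expected_errors


# Hypothetical reads: uniform Q30 over 250 bp passes, uniform Q20 does not
# (250 bases at a 1% error probability give 2.5 expected errors).
keep_good = passes_quality_filter([30] * 250)
keep_marginal = passes_quality_filter([20] * 250)
```

The example shows why the expected-error criterion is stricter than a per-base threshold for long reads: a 250 bp read sitting exactly at Q20 already exceeds one expected error.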

### Quantification of nifH gene copies and nifH gene expression

The abundance and expression of nifH genes from selected major diazotroph taxa (Trichodesmium, UCYN-A, and Crocosphaera) were quantified using quantitative PCR (qPCR) and reverse transcription qPCR (RT-qPCR) [39]. Complementary DNA (cDNA) was synthesized from purified RNA using the QuantiTect Reverse Transcription Kit (Qiagen, MD, USA) following the manufacturer’s guidelines, using equimolar quantities (0.25 μM) of nifH2 and nifH3 primers [35] and 10 ng of purified RNA extract. Negative controls (no-RTs) were generated for each sample in parallel. The qPCR reaction conditions and thermocycling parameters are described in Robidart et al. [40]. Briefly, the reactions were prepared in 96-well optical reaction plates with optical caps (Applied Biosystems, CA, USA) and run on a Real-time PCR System (Roche LightCycler® 96 Instrument, Germany) with the following thermocycling settings: 95 °C for 75 s, then 45 cycles of 94 °C for 15 s and 59 °C for 30 s. The sample reactions (30 μl) were run in triplicate or duplicate and contained 15 μl of AccuPrime™ SuperMix I (Invitrogen, CA, USA), 2.4 μl of 25 mM MgCl2 solution, 9.86 μl of nuclease-free water (Ambion, CA, USA), 0.12 μl each of the forward and reverse primers (0.4 μM final concentration), 0.5 μl of FAM-labeled TaqMan probe (0.2 μM final concentration) and 2 μl of template DNA. Standard dilution series ranging from 10^0 to 10^7 gene copies were used for each reaction. Negative controls with 2 μl of nuclease-free water (Ambion, CA, USA) instead of the template DNA were run on each plate to check for contamination. No amplification was observed in any negative control. Samples with no amplification are denoted as below the detection limit (35–133 copies L−1), while samples with only one or two amplifications in a triplicate run are denoted as detectable but not quantifiable. The efficiency of the qPCR and RT-qPCR reactions ranged from 98.6% to 107.2%, with an average of 103.7%.
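For readers unfamiliar with how efficiencies such as those quoted above are obtained, the sketch below uses the standard relationship between the slope of a standard curve (quantification cycle against log10 of gene copies) and amplification efficiency, E = 10^(−1/slope) − 1. The function names and the synthetic dilution series are illustrative assumptions; the actual calculations were performed as described in the cited protocols.

```python
import math


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq against log10(copies); returns (slope, intercept)."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency as a fraction (1.0 means perfect doubling every cycle)."""
    return 10.0 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq_value, slope, intercept):
    """Interpolate gene copies in a reaction from its Cq using the fitted curve."""
    return 10.0 ** ((cq_value - intercept) / slope)


# Synthetic 10^0 to 10^7 dilution series with perfect doubling (illustration only).
ideal_slope = -1.0 / math.log10(2.0)
standards = [10.0 ** k for k in range(8)]
cq_values = [38.0 + ideal_slope * k for k in range(8)]
slope, intercept = fit_standard_curve(standards, cq_values)
efficiency = amplification_efficiency(slope)
```

A slope near −3.32 corresponds to 100% efficiency; shallower slopes give values above 100%, as for the upper end of the range reported here.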

### RT-qPCR of phosphorus stress marker gene expression in diazotrophs

To characterize the physiological status of diazotrophs, we designed primers and standards targeting the pstS gene for phosphorus stress for both Trichodesmium and UCYN-A (Supplementary Table 1). UCYN-A pstS genes were collected from GenBank and aligned. Sequences were highly conserved and qPCR primers were designed directly from these sequences. For Trichodesmium, a metatranscriptomic sequence database from the Atlantic Ocean was used to collect pstS genes. They were aligned with pstS genes from GenBank, and qPCR assays were designed based on conserved regions in the gene. Genes were amplified from cruise samples and cloned as described below.

pstS amplicons were ligated to the pGEM T-Easy vector and cloned into TOP10 cells (Invitrogen). Plasmids were purified from cultured clones and sequenced to confirm specificity. Sequenced plasmids were used as standards. Standard curves from 10^0 to 10^7 copies were run on each plate with cDNA/DNA extracts from the cruise, and standard curve efficiencies were 103.2% (Trichodesmium) and 99.2% (UCYN-A). In addition to sequencing from the cruise samples, specificity was determined by comparing nifH copy numbers to pstS copy numbers (Supplementary Fig. 1), with the expectation that there should be a 1:1 relationship if primer sets are specific (n = 7, r = 0.99, q < 0.01). pstS gene expression was quantified using RT-qPCR to evaluate potential nutrient limitation in diazotrophs collected near noon across large spatial gradients (stations 1–9). We normalized pstS transcription per liter seawater to the diazotroph nifH gene abundances per liter, and we reported both transcripts per liter and transcripts per nifH copy for Trichodesmium and UCYN-A.
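The normalization described above amounts to a simple ratio. A minimal sketch, with a hypothetical function name and made-up input values, and with the handling of undetectable nifH chosen here as an assumption:

```python
def transcripts_per_gene_copy(psts_transcripts_per_l, nifh_copies_per_l):
    """pstS transcripts per nifH gene copy for one taxon in one sample.

    Returns None when nifH is missing or not above zero, because the ratio
    is undefined in that case (this handling is an assumption).
    """
    if nifh_copies_per_l is None or nifh_copies_per_l <= 0:
        return None
    return psts_transcripts_per_l / nifh_copies_per_l


# Hypothetical sample: 2 x 10^4 pstS transcripts L^-1 and 1 x 10^5 nifH copies L^-1.
per_copy = transcripts_per_gene_copy(2.0e4, 1.0e5)
```

Reporting both quantities matters because a taxon can show high transcripts per liter simply by being abundant, while the per-copy value reflects the expression level of an average cell.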

## Results and discussion

### Distribution of N2 fixation rates

Large spatial variations in surface N2 fixation rates were observed in the northwestern North Atlantic, ranging from below the detection limit (0.19 nmol N L−1 d−1) to over 167 nmol N L−1 d−1 (Fig. 1). FARACAS N2 fixation (gross rate) measurements agreed strongly with the results of discrete 15N2 incubations (net rate) over a wide range of rates (n = 22, r = 0.7, p < 0.01; Supplementary Fig. 2). While the warm and oligotrophic Sargasso Sea is traditionally believed to harbor conditions favorable to N2 fixation, we found relatively low N2 fixation rates across this region (geometric mean of 3 nmol N L−1 d−1). N2 fixation can however still occasionally support a large fraction of local NCP and net primary production (NPP) due to lower NCP and NPP offshore (Fig. 2).

High N2 fixation rates were primarily located in cold subpolar North Atlantic and coastal waters from Nova Scotia to the Mid-Atlantic Bight (geometric mean of 43 nmol N L−1 d−1). These waters were more nutrient rich than the adjacent subtropical waters. Intense N2 fixation in coastal waters has recently been reported by Tang et al. [23]. However, such high N2 fixation rates as we observed in the subpolar North Atlantic have rarely been detected in other subpolar oceans, deserving further exploration. N2 fixation has been reported at measurable rates in the Arctic Ocean [41, 42] and in temperate areas [43]. The elevated rates that we measured off the cold Nova Scotia coast (above 25 nmol N L−1 d−1 in 13 °C seawater) further extends the distribution of high N2 fixation rates (Fig. 1). Our observation of disproportionately large N2 fixation fluxes in subpolar and coastal waters confirms the high contribution of N2 fixation in North American continental shelf waters, which Mulholland et al. [44] estimated to be 0.28 Tg N yr−1 between Cape Hatteras and Nova Scotia. Estimates of N inputs from N2 fixation in the North Atlantic range between 4.3 and 89.6 Tg N yr−1 [45], and have mostly been derived from tropical and subtropical regions. Our results argue for the inclusion of coastal N2 fixation when estimating regional and global N2 fixation fluxes, and for the inclusion of N2 fixation, rarely considered in coastal studies [46, 47], when evaluating N budgets in coastal waters.

Intensive N2 fixation activity observed near the New Jersey coast reached the top 1% of N2 fixation rates ever reported in the literature [49]. The annual recurrence of these diazotrophic blooms (observed in the summers of 2015 and 2017; Fig. 1a) makes this region one of the most significant N2 fixation hot spots in the global ocean, alongside the tropical North Atlantic [20], the western tropical South Pacific [50] and the coast of the eastern Arabian Sea [51, 52]. However, we only captured a snapshot of this diazotrophic hotspot in summer as opposed to observing the full pattern of seasonal variability. Recent studies have started to shed light on the temporal patterns of N2 fixation in the western North Atlantic coastal waters: the peak season of N2 fixation ranges from spring to fall depending on the location [44]. The mechanisms driving such seasonality remain unclear, warranting further study. Phytoplankton blooms in the Mid-Atlantic Bight are generally thought to be enhanced by nutrients supplied from physical mixing, upwelling [53], and riverine runoff [54]. Here we demonstrated that N2 fixation could also be an important source of new nitrogen, contributing substantially to both NCP (>80%) and NPP (>30%) in this area (Fig. 2). The fraction of NCP fueled by N2 fixation should however be interpreted with caution because of differences in the integration time scales of our methods: FARACAS estimates hourly and daily N2 fixation rates, while O2/Ar-NCP observations integrate productivity over a few days in this region. Simultaneous measurements of N2 fixation and NPP in 15N2- and 13C-addition incubations, shown in Fig. 2b, avoid the issue of reconciling different measurement time scales and corroborate the role of N2 fixation in supporting coastal marine production. N2 fixation and marine production are thus closely linked, albeit with significant spatial variations in the western North Atlantic (the contribution of N2 fixation to NPP ranged from 5.6% to 42%). These results also underline the need to further investigate N2 fixation in poorly sampled and unexplored regions.

### Diazotrophic communities

During the transition from regions of low to high N2 fixation, the diazotrophic community shifted dramatically from Trichodesmium and non-cyanobacterial diazotrophs (e.g., γ−24774A11) in the subtropical gyre to UCYN-A in the subpolar gyre and coastal areas (Fig. 3). Trichodesmium accounted for over 50% of nifH gene sequences at stations 1–2. In contrast, UCYN-A constituted a majority portion of nifH gene sequences at stations 3–5. One of the unidentified diazotrophs observed in the subtropical gyre matched an organism (accession number: AF016613.2 in NCBI) sampled within the same region reported in Zehr et al. (1998) [55]. Over our study area, low N2 fixation rates were observed when non-cyanobacterial diazotrophs were abundant. In addition, Crocosphaera and Richelia were not detected in nifH amplicon sequencing. Previous studies have also reported low Crocosphaera abundance in the North Atlantic [56]. Nevertheless, a portion of the diazotrophic community may not be amplifiable with the sequencing primers used in this study, as recent metagenomic deep sequencing has revealed diverse nifH phylotypes that would not be amplified with existing universal nifH primers [22]. Different spatial shifts of diazotrophs distribution have been previously observed in the North Atlantic [17, 57] and in the western Pacific [58, 59]. However, the factors driving such changes in diazotroph communities vary spatially. Therefore, it becomes critical to understand the conditions leading to the shift of diazotrophs and dominance of UCYN-A in coastal waters, as this organism is likely responsible for one of the highest marine N2 fixation rates ever reported.

### Factors controlling the distribution of N2 fixation

The spatial distribution of N2 fixation rates paralleled dissolved inorganic phosphorus (DIP) concentrations (Fig. 4 and Supplementary Fig. 3). Median N2 fixation rates were nearly 8 times higher when DIP concentration exceeded 0.05 μM north of the Gulf Stream (Fig. 5). High DIP concentrations in the subpolar gyre and coastal waters may result from multiple sources including mixing and riverine runoff [60, 61]. In contrast, DIP was depleted in the western subtropical North Atlantic [62]. Our results are consistent with modeling work suggesting phosphorus limitation of N2 fixation in the subtropical North Atlantic [63]. Although the transport of excess phosphorus across the Gulf Stream has been hypothesized to sustain N2 fixation in the subtropical North Atlantic [64], we observed relatively low N2 fixation rates that are comparable to previous estimates [49] in the subtropical North Atlantic.

Iron is another potential factor limiting N2 fixation [16]. We measured relatively high DFe concentrations (0.56–1.83 nM) over our research area (Supplementary Fig. 3), similar to previous observations [65]. These DFe concentrations were generally higher than the iron half-saturation constant for the growth of Trichodesmium [66]. While N2 fixation was highest in regions with elevated DFe (Fig. 5), a stoichiometric excess of DFe ($$\mathrm{Fe}^{\ast} = \mathrm{DFe} - \mathrm{R}_{\mathrm{Fe}} \times \mathrm{DIP}$$; where $$\mathrm{R}_{\mathrm{Fe}} = 0.47\;\mathrm{mmol}\;\mathrm{DFe} : 1\;\mathrm{mol}\;\mathrm{DIP}$$) [67] suggested that iron was not limiting within our study area (Supplementary Fig. 3), an implication in line with recent nutrient addition experiments in this region [68]. However, diazotrophs generally have a higher iron requirement compared to other phytoplankton, e.g., Trichodesmium’s iron use efficiency is fourfold lower than phytoplankton growing on ammonium [69]. The excess DFe at some coastal stations could disappear (and in fact become negative, indicative of iron limitation) if we increase the DFe:DIP ratio (i.e. RFe) by approximately tenfold. Furthermore, iron bioavailability to microorganisms is linked to its organic complexation, rather than the total concentration. Most dissolved iron is present in strong complexes in this region [70]. However, the availability of this strongly complexed iron to diazotrophs is not well understood. Overall, we propose that a fraction of the excess phosphorus supply to the surface waters north of and near the Gulf Stream is assimilated by diazotrophs, supporting intense N2 fixation where iron is available. In contrast, the subtropical gyre harbors less N2 fixation due to phosphorus depletion (Fig. 4).
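The sensitivity of the iron excess to the assumed DFe:DIP ratio can be illustrated numerically. In the sketch below the function name is hypothetical and the DIP values are made up; only the DFe value (the lower end of the observed range) and the ratio of 0.47 mmol DFe per mol DIP come from the text. Note that 0.47 mmol mol−1 is numerically the same as 0.47 nmol Fe per μmol P, so DFe in nM and DIP in μM can be combined directly.

```python
R_FE = 0.47  # mmol Fe per mol P, i.e. nmol Fe per umol P


def fe_star(dfe_nm, dip_um, r_fe=R_FE):
    """Excess dissolved iron (nM) relative to phosphorus: DFe - R_Fe * DIP.

    dfe_nm : dissolved iron in nM
    dip_um : dissolved inorganic phosphorus in uM
    """
    return dfe_nm - r_fe * dip_um


# Lowest observed DFe (0.56 nM) with a hypothetical DIP of 0.2 uM.
excess_default = fe_star(0.56, 0.2)               # positive: iron in excess
excess_tenfold = fe_star(0.56, 0.2, r_fe=4.7)     # negative with a tenfold ratio
```

A positive value indicates iron in excess of phosphorus at the assumed ratio; raising the ratio tenfold flips the sign in this made-up example, mirroring the caveat stated for some coastal stations.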

Other good predictors of N2 fixation in this study include some trace metals e.g. DCu and DMn, and chlorophyll-a concentrations ([Chl]) (Fig. 5 and Supplementary Fig. 4). Although high concentration of copper may be toxic to diazotrophs and phytoplankton [71], a positive correlation was found between N2 fixation rates and DCu. DCu concentrations along the transect ranged from 0.68 to 4.2 nM near the coast, typical of this region [72, 73]. However, the presence of very strong copper-complexing agents in seawater can reduce the available copper concentration to levels low enough to limit phytoplankton growth. For example, diatoms have been demonstrated to be co-limited by copper, iron, and light availability in the Northeast Pacific [74]. In addition, copper may limit denitrification activity of denitrifying bacteria [75]. Higher DCu concentrations could support denitrification in the coastal sediment and anoxic waters, removing bioavailable nitrogen and creating conditions favorable to N2 fixation. Manganese has been shown to be essential for some terrestrial N2-fixing bacteria [76, 77], but the physiological requirement for manganese in marine diazotrophs is not well understood. Manganese shares some sources with DIP in the ocean, including lithogenic dust deposition, sediments, and rivers [78, 79]. A relationship between DCu, DMn, and N2 fixation activity may simply reflect supplies of DIP and Fe or co-occurrence with other unrecognized factors beneficial to N2 fixation in our study area (Supplementary Fig. 5). Finally, N2 fixation rates were highly positively correlated to [Chl] (Supplementary Fig. 4). This result further emphasizes the presence of N2 fixation in highly productive regions [23, 80] in addition to in oligotrophic areas. However, it is unclear whether N2 fixation drives increases in [Chl] and marine production or whether N2 fixation is enhanced after depletion of nitrogen relative to phosphorus and release of organic matter by phytoplankton blooms.

### Factors controlling the distribution of diazotrophs

Absolute nifH gene abundances quantified by qPCR indicate that Trichodesmium was more abundant in the subtropical gyre while UCYN-A dominated in the subpolar and coastal oceans (Fig. 4), paralleling spatial patterns of their relative abundances shown in Fig. 3. Temperature may be the dominant factor driving such community shifts, with phosphorus availability acting as a secondary influence (Fig. 6). Trichodesmium reached maximum abundances (3.7 × 10^5 nifH copies L−1) at stations 2 and 7 near or within the Gulf Stream, where seawater was warm (28 °C) and DIP was low. Their nifH gene transcription was also high here compared to other regions (Fig. 4). This pattern may be driven by Trichodesmium’s preference for higher temperatures, with an optimal growth temperature of 27 °C [81], and their ability to exploit diverse sources of dissolved organic phosphorus [82]. Our results support findings in earlier studies [83], suggesting that the Gulf Stream serves as a transport highway for Trichodesmium. This strong current could carry Trichodesmium from the tropical Atlantic and along the Eastern American coast to temperate oceans (Supplementary Fig. 6), which may explain why Trichodesmium has been captured at high latitudes [84], e.g., near the British Isles [85]. Thus, the Gulf Stream may transport diazotrophs and nitrogen incorporated via N2 fixation in addition to its physical transport of nutrients [86]. Transport of Trichodesmium is also observed in the Kuroshio Current in the North Pacific [87], indicating the potentially widespread role of strong western boundary currents in transporting diazotrophs. Although Trichodesmium was also present in coastal and cold waters, their nifH gene transcription was undetectable (Fig. 4). This may indicate that N2 fixation is not the main N source for Trichodesmium in these regions. The activities and preferred nutrient strategies of Trichodesmium in the cold higher latitudes warrant further study.

In contrast, UCYN-A was more abundant in subpolar and coastal areas where temperature was low and phosphorus was more available despite the presence of nitrate. UCYN-A abundance estimated by nifH qPCR agrees with the results from quantitative 16S rRNA amplicon sequencing (Supplementary Fig. 7). UCYN-A abundance reached 4 × 10^7 copies L−1 near the New Jersey coast (Fig. 4), among the highest abundances ever reported in the literature [49]. High transcription of the nifH gene of UCYN-A in coastal waters implies an active role in N2 fixation. Therefore, the highest N2 fixation rate across our transect was likely driven by this UCYN-A bloom. The hypothesized host of UCYN-A, Braarudosphaera bigelowii, also thrived in this area (Supplementary Fig. 8), with statistical co-occurrence patterns suggesting that it plays a key role in the regional microbial community (Wang et al., unpublished). The UCYN-A/B. bigelowii association has recently been reported in the waters off the coasts of California [88] and Brazil [89], suggesting a widespread distribution.

UCYN-A was more abundant in waters with higher DIP concentrations (>0.05 μM) (Fig. 6). This is consistent with Robidart et al. [40] at DIP concentrations below 0.2 μM in the North Pacific but contradicts the findings of Stenegren et al. [58], who showed a negative correlation between UCYN-A abundance and DIP concentration globally. This discrepancy highlights our limited understanding of the yet-uncultivated UCYN-A group, motivating further physiological observations. The highest UCYN-A abundance observed during our cruise occurred in ~24 °C seawater, in line with its predicted optimal temperature [8] and observed niches in the eastern Atlantic [56]. However, our observations revealed high abundances (>10^6 copies L−1) and expression (>10^5 copies L−1) in colder environments with temperatures as low as 13 °C near Nova Scotia (Figs. 4 and 6). This is also consistent with a recent report of UCYN-A actively fixing N2 in the cold waters of the Arctic Ocean [90]. These unexpected findings expand the temperature-based distribution of active diazotrophs. Some models may consequently be underestimating the N2 fixation budget by setting the threshold for N2 fixation based on the temperature ecophysiology of Trichodesmium, overlooking the role of other diazotrophs in temperate and polar regions. Similarly, the influence of nitrate on N2 fixation in models may need to be revised. Peak N2 fixation and UCYN-A abundance were found in coastal waters with nitrate and nitrite concentrations close to 200 nM (Figs. 5 and 6), reaching or exceeding half-saturation values of nitrate uptake in many marine phytoplankton cultures [91] and in natural phytoplankton assemblages from oceanic waters [92]. Several mechanisms have been hypothesized to explain N2 fixation in the presence of reactive nitrogen [93–96], but the control of reactive nitrogen on UCYN-A and its hypothesized hosts deserves further investigation. Finally, iron enrichment may provide another advantage for UCYN-A to flourish in coastal waters, while the effect of co-occurring manganese is poorly understood (Fig. 6).

Overall, the shift in diazotrophic community structure between subtropical, subpolar and coastal waters and its relation to corresponding changes of environmental controls was further confirmed and visualized by a redundancy analysis (Supplementary Fig. 9): temperature predominantly determined diazotroph composition, with phosphorus and iron playing regulatory roles.

### Expression patterns of an inorganic phosphorus stress marker gene in diazotrophs

We further evaluated the nutrient stress experienced by the dominant diazotrophs by examining the transcription of the inorganic phosphorus stress biomarker pstS [97]. pstS codes for a phosphate-binding subunit of the transporter PstS and has been used to show Trichodesmium’s phosphorus stress in the North Atlantic subtropical gyre [98]. pstS expression in Trichodesmium was only detected in the subtropical gyre, with the highest transcription observed at station 2 (Fig. 7a). In contrast, UCYN-A exhibited high pstS transcription in subpolar and coastal seawaters where DIP concentration was higher (Fig. 7b). However, pstS transcription normalized to nifH gene abundance was significantly lower in these subpolar and coastal regions compared to the subtropical gyre, indicating that phosphorus limitation may be stronger in the subtropical regions where DIP was depleted. Organic phosphorus concentrations were not measured on this cruise, but may also play a role. Both Trichodesmium and UCYN-A may experience their highest inorganic phosphorus stress in the subtropical waters, but Trichodesmium is able to assimilate organic forms of phosphorus to mitigate phosphorus stress in the North Atlantic subtropical gyre [98]. In addition to assessing phosphorus stress, future work should target marker gene expression indicative of iron limitation.

## Conclusion

By applying high-resolution biogeochemical and biological sampling techniques, we captured the large-scale distribution of N2 fixation and diazotrophs from the subtropical North Atlantic to the US East Coast, with contrasting patterns between Trichodesmium and UCYN-A, the dominant diazotrophs in this study. The overlooked high N2 fixation fluxes in the subpolar gyres and in coastal oceans could be significant contributors to global marine N2 fixation. Substantial variations in diazotroph abundance over short distances were observed thanks to our high-frequency sampling techniques, further demonstrating the patchy spatial distribution of diazotrophs. Further, record abundances of UCYN-A quantified here are largely attributed to the high-resolution sampling scheme detecting bloom conditions, which would have been less likely to be captured by the traditional discrete CTD sampling.

Strong gradients in physical and chemical factors enabled the clear attribution of ecological drivers for N2 fixation and diazotrophs. Dramatic changes in N2 fixation rates (from undetectable to the highest 1% of rates measured globally) and shifts in diazotrophic community structure were best explained by phosphorus availability and temperature gradients, respectively. Taken together, our results imply that the environmental controls on N2 fixation rates and diazotroph composition may be different, with the best predictors of diazotroph abundances varying across species. For example, UCYN-A’s niches are distinct from other diazotrophs: high N2 fixation rates and UCYN-A abundance were unexpectedly observed in cold and nitrogen-present regions, challenging traditional assumptions and models about favorable conditions to diazotrophy. Therefore, modeling of N2 fixation ultimately requires observation and simulation of more diverse diazotrophic groups in addition to well-recognized Trichodesmium. Furthermore, deeper investigation of factors unobserved in this study, which may play important roles, such as top-down controls upon N2 fixation [12], would represent valuable avenues for future research. Nevertheless, our approaches for studying the ecophysiology of diverse diazotrophs and for conducting large-scale, high-resolution surveys over broad ocean regimes offer considerable potential for improving our understanding of N2 fixation and our ability to simulate N2 fixation in a changing climate.