Empirical leucine-to-carbon conversion factors in north-eastern Atlantic waters (50–2000 m) shaped by bacterial community composition and optical signature of DOM

Microbial heterotrophic activity is a major process regulating the flux of dissolved organic matter (DOM) in the ocean, while the characteristics of this DOM strongly influence its microbial utilization and fate. To broaden the vertical resolution of leucine-to-carbon conversion factors (CFs), which are needed to convert substrate incorporation into biomass production by heterotrophic bacteria, 20 dilution experiments were performed in the North Atlantic Ocean. We found a depth stratification in empirical CF values from epipelagic to bathypelagic waters (4.00 ± 1.09 to 0.10 ± 0.00 kg C mol Leu−1). Our results demonstrated that the customarily used theoretical CF of 1.55 kg C mol Leu−1 in oceanic samples can lead to an underestimation of prokaryotic heterotrophic production in epi- and mesopelagic waters, while it can overestimate it in the bathypelagic ocean. Pearson correlations showed that CFs were related not only to hydrographic variables such as temperature, but also to specific phylogenetic groups and to DOM quality and quantity indices. Furthermore, a multiple linear regression model predicting CFs from relatively simple hydrographic and optical spectroscopic measurements was attempted. Taken together, our results suggest that differences in CFs throughout the water column are significantly connected to DOM, and also reflect differences linked to specific prokaryotic groups.


Results
Hydrographic characterization of the study area. The location of the sampling stations across both sections (Finisterre and Santander) is shown in Fig. 1. At each station, the hydrographic properties found throughout the water column (Table 1 and Supplementary Information, Fig. S1) were used to select the sampling depths (see Experimental procedures). Hydrographic conditions were similar in both sections, particularly in the epipelagic and upper mesopelagic layers. In both sections, lower mesopelagic waters showed the minimum oxygen concentration (Oxy) at ~ 950 m related to the signal of the Mediterranean water, characterized by high salinity (Sal), particularly apparent in Finisterre (Table 1). In the bathypelagic layer, the lowest temperature and salinity, and relatively high mean dissolved oxygen concentration were recorded (Table 1).
Empirical leucine-to-carbon CFs and derived PHP throughout the water column. The values of the eCFs decreased with depth in both sections (Table 2), ranging from 4.00 to 0.65 in Finisterre and from 0.43 to 0.10 kg C mol Leu−1 in Santander. Importantly, eCFs determined in epi- and mesopelagic waters of Finisterre were higher than the theoretical CF (1.55 kg C mol Leu−1), except for the lower mean eCF value (1.20 kg C mol Leu−1) measured in the 1000 m sample at station 11 (Table 2). Conversely, eCFs in the bathypelagic waters were often lower than the theoretical CF, particularly in Santander. Additionally, eCFs were significantly different between the Finisterre and Santander sections (Student's t-test, P < 0.05), and among depths for the Finisterre section (ANOVA, P < 0.05). Consequently, estimating eCFs at different depths and locations is crucial for understanding PHP trends.
PHP (Supplementary Information, Fig. S2) mirrored the variability of LIR and CFs with depth. The theoretical PHP (tPHP) displayed a quadratic (log-log) relationship with depth (Fig. 2). Because tPHP is simply LIR multiplied by the constant theoretical CF (see Methods), log LIR versus log depth would also fit a quadratic curve (data not shown). Maximum values of empirical PHP (ePHP) were found in the epipelagic and upper mesopelagic layers in both sections, being more than twofold higher than tPHP. However, in the lower mesopelagic waters ePHP was ~ twofold higher in Finisterre but ~ fivefold lower in Santander compared to tPHP estimates. Finally, at the bathypelagic layer the constant theoretical CF overestimated PHP in Santander, while there were no significant differences in Finisterre between ePHP and tPHP. Hence, a stronger vertical gradient emerged for ePHP compared to tPHP at the Santander section (Supplementary Information, Fig. S2). The depth-dependence of ePHP was also better described by a quadratic than by a linear model (Fig. 2), i.e., the slope of PHP versus depth varied with depth. There were no significant differences between averaged empirical and theoretical PHP of Finisterre and Santander sections (Student's t-test, P > 0.1). However, significant differences were found for the […]
Figure 1. […] (Supplementary Information, Fig. S1). The numbers indicate the stations in which leucine incorporation rate (LIR) was measured. The experiments to determine empirical leucine-to-carbon conversion factors (eCFs) were carried out at biological stations 11, 111 and 115 (highlighted with a red circle). Bacterial diversity and the quality and quantity of DOM were determined at stations 11 and 115.
[…] Table S1). On the contrary, both H' and the estimated ASV richness (SChao1) did not show a clear vertical pattern (Fig. S3). The top 37 abundant ASVs/phylotypes (relative abundance > 1%; Fig. S3), with relatively similar vertical distributions at both stations, showed different trends among them. The ASVs belonging to SAR324, SAR202, and the group JL-ETNP-F27 were almost absent within epi- and upper mesopelagic waters, and their relative contribution increased with depth (accounting together for up to 70% and 49% of the total reads in Finisterre and Santander, respectively, in bathypelagic waters). Other ASVs/phylotypes, such as Actinobacteria and Gammaproteobacteria (e.g., SAR86, SUP05_2 and Gammaproteobacteria_Others), showed the opposite trend. Other phylotypes, such as SAR406 and those belonging to Alphaproteobacteria (green tones, Fig. S4), did not show a clear trend throughout the water column. Interestingly, a few groups showed opposite patterns between stations. For instance, Planctomycetes were notably abundant at 250, 500 and 2750 m in Finisterre, while in Santander this phylum was hardly ever present in mesopelagic waters but was an important member of bathypelagic communities. Finally, at both stations the mean relative contribution of the group Others (composed of ASVs in very low abundance, accounting for < 1% of the total number of reads even after adding them up at Phylum level) was higher in bathy- and lower mesopelagic waters than in epi- and upper mesopelagic waters. In general, the composition of the bacterial community was similar for both stations (ANOSIM, r = 0.22, P = 0.21). However, significant differences arose among epi-, upper and lower mesopelagic, and bathypelagic bacterial communities (ANOSIM, r = 0.57, P = 0.02). Among the 37 abundant ASVs/phylotypes, Actinomarina_1 and SAR202_Others were mainly responsible for the dissimilarities found among communities inhabiting those different depth layers (Supplementary Information, Table S2).
Actinomarina_1, SAR202_Others, SAR324 (Marine group)_Others and SAR406 contributed to 40% of the dissimilarity between epi-, meso- and bathypelagic communities, while SAR202_1, SAR202_2 and SAR202_3 accounted for 20% of the dissimilarity between upper and lower mesopelagic waters.
Vertical variability of DOM. Overall, DOM optical indices showed slightly higher values in Santander than in Finisterre, especially in the epipelagic layer (Fig. 3). In addition, they showed greater variability in epi- and upper mesopelagic waters, while the profiles were much more uniform throughout lower meso- and bathypelagic waters. Dissolved organic carbon (DOC) and DOM optical indices, with the only exception of peak M, decreased with depth (Fig. 3A). Both stations showed maximum values of DOC (mean ± sem: 72.7 ± 3.6 and 81.0 ± 4.1 µmol C L−1 for Finisterre and Santander, respectively) and peak T (0.76 ± 0.11 and 0.73 ± 0.03 QSU, respectively) at the epipelagic layer, while the minimum values (50.0 ± 0.8 µmol C L−1 for DOC and 0.31 ± 0.01 QSU for peak T in both stations) were found in bathypelagic waters.
Relationships between eCFs and hydrography, DOM and bacterial diversity. Significant bivariate correlations were found between eCFs and potential temperature (Tpot) and DOM properties (Table 3). Positive correlations were found with peak T and a254, and negative correlations with peak M and the ratios of peak M to peak T, DOC and a254 (Table 3). On the other hand, eCFs did not significantly correlate with estimated ASV richness (SChao1) nor with H'. However, they were significantly (P ≤ 0.05) related to the centered log-ratio (CLR) transformed abundance of some specific ASVs/phylotypes (Fig. 4). For instance, the CLR-transformed abundances of the ASVs Actinomarina_1 and Actinomarina_2, and of Gimesiaceae, were best described by a positive quadratic model, decreasing for eCF values < 2 but increasing for eCFs > 2. Conversely, the CLR-transformed abundance of Pla3_lineage and SAR324_MGB_1 followed a negative quadratic function of eCFs, increasing until eCF ~ 2 and then decreasing. In general, these specific taxa greatly determined the observed variations in CFs (R² > 0.5, P ≤ 0.05) (Supplementary Information, Table S3).
Notwithstanding the limited number of samples, and after testing all possible combinations of our variables (including both abundance of specific microbial taxa and DOM composition indices), we obtained a preliminary multiple linear regression model to empirically estimate a CF from temperature and DOM humic and protein fluorescence values (R 2 = 0.96; P = 0.01; n = 8): with P = 0.05 for temperature, P = 0.01 for peak M and peak T, and P = 0.03 for the intercept.

Discussion
A higher spatial resolution of eCF values is required for an accurate estimation of PHP throughout the water column. To the best of our knowledge, there are no previous studies estimating empirical leucine-to-carbon conversion factors in bathypelagic waters (> 1000 m), and there are very few studies that have investigated their relationship with bacterial diversity 3,10 . Importantly, none of them has investigated the relationship among CFs, bacterial diversity and composition of the DOM pool. A great variability of eCF values (0.09-1.47 kg C mol Leu −1 ) was previously found at epipelagic waters across the world's open oceans 3 . The eCF values obtained in this study were comparable to those found in other coastal and epipelagic waters, such as at the oligotrophic Mediterranean Sea during summer stratification (0.29-3.25 kg C mol Leu −1 ) 24 , the Galician coast (0.14-3.55 kg C mol Leu −1 ) 8,25-27 , or along an environmental gradient in an estuarine system at the northern South China Sea (0.48-1.69 kg C mol Leu −1 ) 28 .
Overall, the eCFs measured in this study (Table 2) were generally higher than those previously reported for the same depth range (mean ± sem: 1.16 ± 0.61 kg C mol Leu−1 11 and 0.55 ± 0.12 kg C mol Leu−1 12 ). This is likely related to the relatively higher availability of organic substrates in our area of study 21,29–31 , especially in Finisterre 32 , where eCFs were remarkably higher than in Santander for each depth range. We found the highest eCF values from surface down to 500 m in the Finisterre section, which is consistent with the higher DOC concentrations measured in epipelagic waters of our study area (Fig. S4) compared to epipelagic samples from subtropical north Atlantic waters (54-79 µmol C L−1, < 200 m) 33,34 . Indeed, high concentrations of labile DOM accumulated in the epipelagic layer during the upwelling season 18 , representing 50% of the total dissolved organic carbon susceptible to microbial utilization 35 , which underlines the key contribution of DOM to the export of new primary production in the NW Iberian upwelling system. Eventually, this DOM excess produced during the upwelling season supports both the offshore export and sinking fluxes of organic matter 34,36,37 . In fact, our results show that deeper down, at 1000 m, the eCF remained relatively high in Finisterre. By contrast, eCFs decreased considerably compared to the theoretical CF in Santander. This could be partially explained by the biogeochemical differences in water masses between the Santander and Finisterre stations. At ~ 1000 m, we found the high-salinity, low-oxygen signature of the Mediterranean water (Table 1), with lower DOC concentrations than the water mass immediately above (the North Atlantic Central water, 250-900 m) 15 .
In the mesopelagic waters (Mediterranean Water, 1000 m depth and Labrador Sea Water, 2000 m depth) of the Finisterre section, intense water mass mixing has been shown 22 , resulting in a vertical DOM movement, which supports higher activity of bacterial communities compared to Santander 19 . Overall, in the bathypelagic waters of both sections, eCFs estimated in this study were low and similar to those reported in oligotrophic areas 11,12 and in bathypelagic waters from subtropical north Atlantic waters with similar DOC concentrations (44.07 ± 2.0 µmol C L−1, > 2000 m) 32 .
In such a context, we must also take into account that the differences between other CFs estimated in epi- and mesopelagic waters 11,12 and those of our study may be influenced by methodological differences in the experimental design. We conducted the manipulation experiments by diluting samples, while Gasol et al. 11 diluted and filtered, and Baltar et al. 12 only filtered. Thus, Gasol et al. 11 and Baltar et al. 12 , by filtering the community through 0.6 µm, may have left out organic matter aggregates, which could explain why they reported lower values than those found in our study. eCFs therefore appear to be lower when using a combination of filtration and dilution, or only filtration, than when using just dilution. This is particularly relevant in our area of study since the North Atlantic coast is strongly affected by upwelling events, particularly in Finisterre, which transport organic matter from the coast to the open ocean 38 , where large aggregates are likely to be abundant.
Our results have crucial implications for PHP estimation. Overall, this study has demonstrated that the systematic use of the theoretical CF (1.55 kg C mol Leu−1) 6 would cause an important underestimation of the PHP in epi- and upper mesopelagic waters, but a significant overestimation at the bathypelagic layer, particularly in the Santander section. This result implies the existence of a much more intense gradient of PHP throughout the water column than previously reported. More importantly, our results showed that PHP does not vary linearly with depth; rather, its depth-dependence is best described by a quadratic function. Consequently, in epipelagic waters (where most of the previous studies were carried out) and bathypelagic waters, PHP values were found to be lower than those expected under a linear depth-dependence, while the opposite occurs at the mesopelagic layer. This curvature suggests differences in hydrographic and/or physiological constraints operating throughout the water column. In oligotrophic environments (and, by extension, in bathypelagic waters, with relatively low availability of bio-labile organic substrates), low eCFs are attributed to energy consumption directed at preserving metabolic processes rather than producing bacterial biomass (low PHP), so that leucine respiration is essentially destined to keep the cells alive 28,39 . Conversely, in epi- and mesopelagic waters, our results suggest that the quantity and quality of organic substrates do not limit bacterial biomass production. Thus, leucine incorporation could be mainly destined towards biomass production (relatively higher eCFs and PHP) in comparison to other open-ocean areas or deep waters.
Assuming that the patterns found for this region can be applied to similar areas, such as those under the influence of upwelling, the differences between the empirical and theoretical PHP calculated (Supplementary Information, Fig. S2) would greatly influence the estimated carbon fluxes mediated by heterotrophic bacterioplankton activity in the ocean. This outcome implies that our current predictions on the role of bacterial remineralization throughout the water column, and hence on carbon fluxes between the surface and the deep ocean, need to be revised. Our results may help to reconcile the discrepancy between the amount of carbon sinking out of the surface ocean and the biological carbon demand in the dark ocean 40,41 , depending on location and depth of the study area.
Importantly, we studied the influence of community composition and DOM properties on eCF values. Overall, we did not find a correlation between eCFs and ASV richness (SChao1) or the Shannon diversity index. However, those indices may not reflect whether specific microbial taxa are relevant in the degradation of marine DOM 13 . Nevertheless, our study displayed shifts in eCF values which might be partially related to several specific groups. On the one hand, Actinomarina_1, Actinomarina_2, and Gimesiaceae showed positive quadratic relationships with eCFs. Our results suggest that these groups may play a key role in determining prokaryotic activity, particularly in epipelagic waters. Overall, the higher eCFs found in epipelagic waters could be attributed to the occurrence of these abundant bacterial phylotypes, stimulated by higher amounts of phytoplankton exudates and/or higher temperature (Table 1). On the other hand, Pla3_lineage and SUP05_Others showed well-fitted relationships between their average abundance and eCFs, explaining 81% and 78% of their variability, respectively. These phylotypes followed a negative quadratic function with eCFs, which predicts intermediate eCFs when these bacterial groups are abundant. These groups were particularly abundant in lower mesopelagic waters, which suggests that they might be involved in the degradation of the relatively recalcitrant compounds that predominate in these waters. In the same way, SAR324 and SAR202 (Chloroflexi), which were the most abundant groups in both sections in lower mesopelagic and bathypelagic waters and showed strong and weak correlations with CFs, respectively, should be related to the oxidation of recalcitrant dissolved organic matter 42 and consequently to lower eCFs.
For the first time, our results have also shown that eCFs are shaped not only by depth-related hydrographic features and some specific taxa, but also, to a greater extent, by DOM composition. In such a context, peak M […] (Table 3). Thus, the humic-like substances (more reworked/refractory DOM generated as by-products of respiration processes 18 , i.e., less bioavailable material) were inversely associated with eCFs, producing lower eCF values when DOM is less labile (i.e., bathypelagic waters). Peak T (related to the production of biolabile DOM) and a254 (which is related to DOC 33,43 and thus considered a quantity factor) were also positively and significantly related to our eCFs, which would indicate that CF values are higher when DOM is likely more bioavailable (i.e., epipelagic waters). On the other hand, the significant correlation found between eCFs and the peak M/DOC ratio (and the peak M/a254 or even peak M/peak T ratios) highlighted the importance of both quantity (DOM concentration) and quality (fluorescence peak M) of organic compounds as controlling factors in the determination of carbon conversion factors. Taken together, this presumably implies that different DOM molecular groups and their availability in the environment may influence carbon conversion factors. In this sense, the only significant multiple regression found to explain our eCFs with the physical, chemical and biological variables linked CF values mainly to peaks M and T (DOM features) and, to a lesser extent, to temperature. It is interesting to note the opposite relation of eCFs with temperature in the multiple regression compared with the bivariate model. Both results are consistent, as these coefficients represent different processes in each correlation.
The simple bivariate models represent the direct and complete relation between two variables, while multiple regressions show the correlation with each variable excluding changes due to the others (discriminating processes).
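The contrast between bivariate and partial coefficients described above can be illustrated with a minimal sketch. The data below are hypothetical and noise-free (they are not the measurements of this study): `x1` stands in for temperature and `x2` for a DOM index that covaries strongly with it. The bivariate slope of `y` on `x1` is positive, while the coefficient of `x1` in the multiple regression is negative.

```python
import numpy as np

# Hypothetical, noise-free data (not from this study):
# x2 stands in for a DOM index, x1 for temperature.
x2 = np.arange(8.0)
x1 = x2 + np.array([0.5, -0.5] * 4)   # x1 is strongly collinear with x2
y = -1.0 * x1 + 3.0 * x2              # true partial effect of x1 is negative

# Bivariate model: y regressed on x1 alone (total association).
bivariate_slope = np.polyfit(x1, y, 1)[0]

# Multiple regression: y regressed on an intercept, x1 and x2 together.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
partial_slope_x1 = coef[1]

print(f"bivariate slope for x1: {bivariate_slope:.2f}")
print(f"partial slope for x1:   {partial_slope_x1:.2f}")
```

The sign change arises only because `x1` carries the signal of `x2` when it is the sole predictor; once `x2` is in the model, the coefficient of `x1` reflects its own contribution.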
In conclusion, this study showed that empirical leucine-to-carbon conversion factors decreased with depth, showing a wide range of variability throughout the water column for two stations in north-eastern Atlantic waters, resulting in a non-linear dependency of PHP with depth. Our results imply that the use of the theoretical factor of 1.55 kg C mol Leu −1 6 in oceanic waters would lead to the underestimation of prokaryotic carbon production in epi-and upper mesopelagic waters, and to its overestimation in bathypelagic waters. In addition, for the first time, we have provided evidence of strong and significant links among eCFs, environmental variables, DOM, and bacterial community composition. An exploratory preliminary multiple regression model is provided as starting point for the estimation of conversion factors in open-ocean waters from relatively simple optical measurements and basic hydrographic observations that can be easily obtained in near real-time. This research illuminates dark-ocean biogeochemistry that is broadly consequential for reconstructing the global carbon cycle.

Methods
Sampling strategy. This study was carried out during the MODUPLAN 0814 cruise on board the RV Sarmiento de Gamboa (August 2014), visiting several stations along two perpendicular sections, off the coasts of Galicia (Finisterre section, 43°N, 9°W to 43°N, 14°W) and Cantabria (Santander section, 43°N, 3° 47′W to 45°N, 3° 47′W) (Fig. 1) in the northern Atlantic Ocean. Vertical profiles, from surface to a maximum depth of 5300 m (depending on the bathymetry of each station), as well as seawater sampling, were carried out using a CTD-ADCP-rosette system, provided with oxygen and fluorescence sensors as well as twenty-four 12-L Niskin bottles. Thus, along the cruise we performed: (1) hydrographic characterization (potential temperature, salinity and dissolved oxygen concentration) of the water column (for all the stations; Fig. 1); (2) experiments for the empirical determination of leucine-to-carbon conversion factors (at biological stations: 11, 111 and 115; Fig. 1); (3) determination of bacterial metabolism (leucine incorporation rate to estimate heterotrophic prokaryotic production, at biological stations; Fig. 1), (4) bacterial diversity (stations 11 and 115); and (5) DOM characterization (concentration of DOC and DOM optical properties, at biological stations; Fig. 1).

Experimental setup for determining eCFs.
With the aim of determining in situ factors to convert LIRs into carbon bacterial production, dilution experiments were performed at 500, 1000 and 2000 m in stations 11 (Finisterre) and 115 (Santander), and at 50 and 100 m in station 111 (Finisterre). At each station and depth, the water sample was diluted (1:10) with 0.2 µm-filtered (Acropack 1000, Pall) seawater from the same sample and incubated in 2-L polycarbonate bottles in the dark at the corresponding in situ temperature (± 1.5 °C). Subsamples were taken for estimating LIR, and biomass was determined by flow cytometry (see below) at 24-h intervals until bacteria reached the stationary growth phase, after 6-8 days since the beginning of the incubations.
Conversion factors were subsequently calculated following the cumulative method 44 , which estimates the slope of the linear regression between prokaryotic biomass (y-axis) and leucine incorporation (x-axis), accumulated at different time intervals during the time course incubations. These experiments assume that increases in leucine incorporation are accurately reflected in increases in bacterial biomass. This assumption is often not met because of different processes (e.g., grazer influence and/or viral lysis) 45,46 . Hence, to derive resolvable slope values, not all time points were used (i.e., certain data points, where biomass decreased, were excluded).
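As an illustration of the cumulative approach, the following minimal sketch integrates LIR over time and regresses biomass on the accumulated leucine. The function name, the trapezoidal integration, the units and the example values are assumptions made for illustration, not the exact procedure or data of this study. With biomass in µg C L−1 and leucine in nmol L−1, the slope is numerically equal to kg C mol Leu−1.

```python
import numpy as np

def cumulative_cf(time_d, lir_nmol_l_d, biomass_ug_c_l):
    """Slope of biomass versus time-integrated leucine incorporation.

    Hypothetical helper: inputs are sampling times (d), LIR (nmol Leu L-1 d-1)
    and biomass (ug C L-1) measured at those times.
    """
    t = np.asarray(time_d, dtype=float)
    lir = np.asarray(lir_nmol_l_d, dtype=float)
    biomass = np.asarray(biomass_ug_c_l, dtype=float)
    # Trapezoidal integration of LIR gives cumulative leucine (nmol L-1).
    steps = 0.5 * (lir[1:] + lir[:-1]) * np.diff(t)
    cum_leu = np.concatenate(([0.0], np.cumsum(steps)))
    slope, _intercept = np.polyfit(cum_leu, biomass, 1)
    return slope  # ug C nmol-1, numerically equal to kg C mol-1

# Made-up incubation time course (daily sampling).
time_d = [0, 1, 2, 3, 4]
lir = [0.1, 0.2, 0.4, 0.8, 1.0]
biomass = [0.5, 0.8, 1.4, 2.6, 4.4]
ecf = cumulative_cf(time_d, lir, biomass)
print(f"eCF = {ecf:.2f} kg C mol Leu-1")
```

Time points at which biomass decreases would be dropped from the three input arrays before calling the function.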
Prokaryotic abundance and biomass. Total prokaryotic abundance during the dilution experiments was determined daily on board by flow cytometry, following the method previously described by Gasol et al. 47 . Prokaryotic cells were detected by their distinct signature in a plot of side scatter vs. green fluorescence using a FACSCalibur flow cytometer (Becton Dickinson). The biovolume of prokaryotic cells was estimated using the calibration obtained by Calvo-Díaz and Morán 48 relating relative light side scatter (population SSC divided by bead SSC) to cell diameter, assuming spherical shape. Cell biovolume (BBv) was converted into carbon biomass (C; pg cell−1) using the allometric relationship of Norland 49 .
Vertical profiles of LIR and PHP. In situ LIRs were measured using two different methods. The centrifugation method was used for epi- and mesopelagic waters (≤ 1000 m) 15 , whilst the filtration method 15 was used for bathypelagic samples, because of their typically lower prokaryotic activity. Both methods used [3H]-leucine (160 Ci mmol−1, GE Healthcare) at a final concentration of 5 nmol L−1 15 . Incubation time and sample volume were adjusted depending on the expected prokaryotic abundance and activity. For the centrifugation method, three replicates of 1.2 mL and two TCA-killed blanks (5% final concentration) were incubated in the dark at simulated in situ temperature (± 1.5 °C) for 2 to 6 h. The incubations were stopped by adding TCA (5% final concentration). Prokaryotic proteins were precipitated by two successive centrifugation steps (12,000 rpm, 10 min), including one 1-mL 5% TCA wash, following Kirchman et al. 50 with slight modifications 51 . For the filtration method, 40-mL samples, in duplicate, plus two formaldehyde-killed blanks (2% final concentration) were incubated in the dark at in situ temperature for 6 to 24 h. Subsequently, the incubations were stopped by adding formaldehyde (2% final concentration), filtered through 0.2-µm polycarbonate filters (25 mm diameter, Millipore), and rinsed twice with 10 mL of 5% ice-cold TCA. Finally, the filters were air dried and transferred to scintillation vials. For both centrifuged and filtered samples, radioactivity was measured in a scintillation counter (Perkin-Elmer TriCarb 3100TR) at least 18 h after the addition of the scintillation cocktail (Ultima Gold XR). The disintegrations per minute (DPMs) of the blanks were subtracted from the mean DPMs of the respective samples, and the resultant DPMs were converted into LIRs 51 .
From LIR estimates, PHP (μmol C m−3 d−1) was calculated as PHP = LIR * CF, where CF is the leucine-to-carbon conversion factor expressed in kg C mol Leu−1. The theoretical PHP was determined by applying the theoretical CF proposed by Simon and Azam 6 , 1.55 kg C mol Leu−1, while the empirical PHP was determined by applying the in situ eCFs obtained in this study.
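The unit bookkeeping behind PHP = LIR * CF can be sketched as follows. This is an illustrative example only: it assumes LIR is expressed in pmol Leu L−1 h−1 (the LIR units are not stated here), and the function name and the example LIR value are hypothetical. The theoretical CF (1.55) and the highest eCF (4.00) are the values reported in the text.

```python
C_MOLAR_MASS = 12.011   # g C per mol C
THEORETICAL_CF = 1.55   # kg C mol Leu-1 (Simon and Azam)

def php_umol_c_m3_d(lir_pmol_l_h, cf_kg_c_mol_leu):
    """Convert LIR (assumed pmol Leu L-1 h-1) into PHP (umol C m-3 d-1)."""
    # pmol -> mol leucine, then kg C -> g C
    g_c_l_h = lir_pmol_l_h * 1e-12 * cf_kg_c_mol_leu * 1e3
    # g C -> umol C
    umol_c_l_h = g_c_l_h / C_MOLAR_MASS * 1e6
    # L -> m3 and h -> d
    return umol_c_l_h * 1e3 * 24.0

lir = 10.0  # hypothetical LIR value, pmol Leu L-1 h-1
t_php = php_umol_c_m3_d(lir, THEORETICAL_CF)
e_php = php_umol_c_m3_d(lir, 4.00)   # highest eCF reported in this study
print(f"tPHP = {t_php:.1f}, ePHP = {e_php:.1f} umol C m-3 d-1")
print(f"ePHP / tPHP = {e_php / t_php:.2f}")
```

Because the conversion is linear in CF, the ratio ePHP/tPHP for a given sample equals eCF divided by 1.55, independent of the LIR units chosen.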
DNA extraction, amplification, sequencing, and bioinformatics. Seawater samples for DNA analyses were collected at each sampling depth by filtering 10-15 L through 0.22-µm Sterivex filters (Millipore). Then, 1.8 mL of lysis buffer (40 mM EDTA, 50 mM Tris-HCl, 0.75 M saccharose) was added to each cartridge filter, and the filters were stored at − 80 °C until further analysis. The DNA extraction was performed following the phenol-chloroform extraction method described by Massana et al. 52 with slight modifications 15 . Cell lysis was performed by a 45-min digestion with freshly made lysozyme (1 mg mL−1 final concentration) at 37 °C, followed by a 60-min proteinase K digestion (0.2 mg mL−1 final concentration) with sodium dodecyl sulfate (SDS) (10%) at 55 °C. Then, DNA was extracted twice in phenol:chloroform:IAA (25:24:1) and once in chloroform:IAA (24:1). The extracted DNA was concentrated using an Amicon Ultracel 100 k filter unit (Millipore). DNA concentration and purity were quantified according to the A260/A280 ratio using a Nanodrop spectrophotometer (Thermo Scientific, USA). Nucleic acid extracts were stored at − 20 °C until further analysis.
DNA was analyzed in an Illumina Miseq platform using 2 × 250 bp paired-end approaches. From raw sequence data, primers and spurious sequences were trimmed using cutadapt trimming ~ 50 bp. Exact ASVs were differentiated by using dada2 54 implemented in R 55 . The approach is threshold free, inferring exact variants up to one nucleotide of difference using the Q scores in a probability model. This pipeline was implemented through the high-performance supercomputing resources belonging to the Centro Tecnolóxico de Supercomputación de Galicia (CESGA). Sequences were aligned against SILVA 132 16S rRNA database 56 as reference. Finally, singletons (ASVs found only once in the final ASV table) were excluded, as they have been shown to be likely the result of PCR or sequencing errors 57 . The number of reads per sample ranged from 6,513 to 31,282 in Finisterre and from 9,674 to 25,499 in Santander, with a total of 213,576 reads. The dataset was thus rarefied to the lowest number of reads per sample (6,513 reads) to enable diversity comparisons among samples. ASV richness and diversity metrics were determined implementing the function estimateR (vegan package, 58 ) in R 55 .
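For reference, the two alpha-diversity metrics used in this study (H' and SChao1) can be sketched in a few lines. This is an illustrative implementation, not the vegan code: the Chao1 expression is the commonly used bias-corrected form, and the example read counts are made up.

```python
import math

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over ASVs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)   # ASVs observed once in the sample
    f2 = sum(1 for c in counts if c == 2)   # ASVs observed twice in the sample
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Made-up read counts for one rarefied sample.
sample = [10, 5, 2, 2, 1, 1, 1]
print(f"H' = {shannon(sample):.3f}, SChao1 = {chao1(sample):.1f}")
```

Both functions expect the counts of a single sample after rarefaction, so that values are comparable among samples.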
DOC concentration and DOM optical properties. All DOM samples above 200 m were filtered under positive pressure of nitrogen using an acid-clean all-glass system and combusted (450 °C) GFF filters. Water samples for DOC analysis were collected in combusted (450 °C) glass ampoules, and acidified with H 3 PO 4 to pH < 2. The ampoules were heat-sealed and DOC concentrations were determined with a Shimadzu TOC-V CSH analyzer by high-temperature Pt-catalytic oxidation 59 . Samples were calibrated daily with potassium hydrogen phthalate (99.95-100.05%, p.a., Merck) and the precision of the measurements was 1 µmol C L −1 . The accuracy of the system was checked with the reference samples supplied by D. A. Hansell (University of Miami, USA).
Statistical analysis. The normality of the variables was tested with the Shapiro-Wilk test 61 . Then, Pearson correlation test 62 , performed in XLSTAT 63 , was used to determine the bivariate correlation between eCFs and the hydrographic features, the DOM optical properties and bacterial diversity. Multiple linear regression models were adjusted using the package STATISTICA by StatSoft. In order to study the variability of PHP with depth, linear and second order polynomial regression models were fitted to data. The best fitted model was selected according to the lowest value of the Akaike's Information Criterion (AIC) 64 . Then, one-way analysis of variance (ANOVA) was applied in order to test for significant differences between mean empirical and theoretical PHP. A Tukey's post hoc test was used to determine which depth layers (epipelagic, upper and lower mesopelagic, and bathypelagic) were significantly different from each other.
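The AIC-based choice between a linear and a second-order polynomial model can be sketched as below. This is a minimal example under stated assumptions: the least-squares form AIC = n ln(RSS/n) + 2k is used, with k counting the polynomial coefficients plus the error variance; the helper name and the log-log PHP profile are hypothetical.

```python
import numpy as np

def aic_polyfit(x, y, degree):
    """AIC of a least-squares polynomial fit: n * ln(RSS / n) + 2k."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    n = len(y)
    k = degree + 2  # polynomial coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k

# Hypothetical log-log profile with curvature and a small fixed perturbation.
depth_m = np.array([50, 100, 250, 500, 1000, 2000, 3000], dtype=float)
log_depth = np.log10(depth_m)
wiggle = np.array([0.01, -0.01, 0.01, -0.01, 0.01, -0.01, 0.01])
log_php = 2.0 - 0.2 * log_depth - 0.3 * log_depth ** 2 + wiggle

aic_linear = aic_polyfit(log_depth, log_php, 1)
aic_quadratic = aic_polyfit(log_depth, log_php, 2)
best = "quadratic" if aic_quadratic < aic_linear else "linear"
print(f"AIC linear = {aic_linear:.1f}, AIC quadratic = {aic_quadratic:.1f} -> {best}")
```

The extra parameter of the quadratic model is penalized by the 2k term, so it is preferred only when the reduction in residual sum of squares outweighs that penalty.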
To compare bacterial community composition among depth layers, an analysis of similarity (ANOSIM), based on Bray-Curtis dissimilarity, was implemented. Then, a SIMPER analysis determined the main ASVs responsible for the Bray-Curtis dissimilarity between each pair of groups 65 . Both analyses were implemented with the vegan package 58 in R 55 .
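The Bray-Curtis dissimilarity underlying both ANOSIM and SIMPER can be written out explicitly. The sketch below is a plain illustration of the index for two samples with ASVs in the same order; it is not the vegan implementation, and the example counts are made up.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

# Made-up ASV counts for two samples (same ASV order in both).
shallow = [6, 4, 0]
deep = [4, 6, 0]
print(f"Bray-Curtis = {bray_curtis(shallow, deep):.2f}")
```

The index ranges from 0 (identical composition) to 1 (no shared ASVs); SIMPER then decomposes the numerator into the contribution of each ASV.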
All "count zeros" were replaced in the microbial absolute abundance matrix by the Bayesian multiplicative method (function cmultRepl in the zCompositions package in R), according to Quinn et al. 66 . Then, centered log-ratio (CLR) transformation of abundances was performed through the function clr (MASS package). Finally, linear and second order polynomial regression models were fitted to the relationship between eCFs and the averaged (all depths within each depth layer) (CLR) transformed abundance of ASVs/phylotypes. Again, the best fitted model was selected according to the lower AIC value.
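The CLR transformation itself is simple to state: each log-abundance is centered on the mean log-abundance of its sample. The sketch below is illustrative only; in particular, zeros are replaced with a fixed pseudocount, which is a simplified stand-in for the Bayesian multiplicative replacement used in this study, and the function name and counts are hypothetical.

```python
import numpy as np

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample (or rows of samples)."""
    x = np.asarray(counts, dtype=float)
    # Simplified zero handling (assumption): NOT the Bayesian multiplicative method.
    x = np.where(x == 0, pseudocount, x)
    log_x = np.log(x)
    # Subtract the per-sample mean of the logs (log of the geometric mean).
    return log_x - log_x.mean(axis=-1, keepdims=True)

# Made-up ASV counts for one sample.
sample = [10, 0, 5, 85]
transformed = clr(sample)
print(np.round(transformed, 3))
```

By construction the CLR values of each sample sum to zero, so a positive value indicates an ASV that is more abundant than the geometric mean of that sample.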
All statistical analyses were performed in R 55 unless otherwise specified.