Materials for

Marine microbial communities rely on dissolved organic phosphorus (DOP) remineralisation to meet phosphorus (P) requirements. We extensively surveyed the genomic and metagenomic distribution of genes directing phosphonate biosynthesis, substrate-specific catabolism of 2-aminoethylphosphonate (2-AEP, the most abundant phosphonate in the marine environment), and broad-specificity catabolism of phosphonates by the C-P lyase (including methylphosphonate, a major source of methane). We developed comprehensive enzyme databases by curating publicly available sequences and then screened metagenomes from TARA Oceans and Munida Microbial Observatory Time Series (MOTS) to assess spatial and seasonal variation in phosphonate metabolism pathways. Phosphonate cycling genes were encoded in diverse gene clusters by 35 marine bacterial and archaeal classes. More than 65% of marine phosphonate cycling genes mapped to Proteobacteria with production demonstrating wider taxonomic diversity than catabolism. Hydrolysis of 2-AEP was the dominant phosphonate catabolism strategy, enabling microbes to assimilate carbon and nitrogen alongside P. Genes for broad-specificity catabolism by the C-P lyase were far less widespread, though enriched in the extremely P-deplete environment of the Mediterranean Sea. Phosphonate cycling genes were abundant in marine metagenomes, particularly from the mesopelagic zone and winter sampling dates. Disparity between prevalence of substrate-specific and broad-specificity catabolism may be due to higher resource expenditure from the cell to build and retain the C-P lyase. This study is the most comprehensive metagenomic survey of marine microbial phosphonate cycling to date and provides curated databases for 14 genes involved in phosphonate cycling.

The PDF file includes:

Supplementary Text Figs. S1 to S18 Legends for data S1 to S3 References
Other Supplementary Material for this manuscript includes the following:

Data S1 to S3
Supplementary Text A step-by-step procedure for calculating the Hill numbers Any diversity estimates are affected by sample sizes (or sampling efforts) and the abundance in each sample.This is especially problematic when using traditional diversity indices, such as species richness and Shannon Entropy.These indices may underestimate diversity from samples with low abundance, hindering the diversity comparison across spatial and temporal gradient.Therefore, we used a resampling approach to rarefy or extrapolate the Hill number to a fixed sample coverage (or a fixed sampling effort) (22).In plain language, we first resampled the otoliths from one individual to 2 times the total number of individuals in a 10-ka moving window to construct a Hill number accumulation curve.As shown in fig.S1, the Hill numbers (or effective number of taxa, y-axis) increase with the number of resampled otoliths (x-axis).Here, each curve represents a 10-ka moving window.The yellow line is the rarefied part of the curve, the purple line is the extrapolated part, and the symbol is the observed Hill number.Next, we calculate the sample coverage for each resampled number of otoliths based on the proportion of singleton and doubleton species (i.e., species represented by one or two otoliths) (22).In fig.S2, the sample coverage on the y-axis is the ratio between the number of otoliths represented by the species in a 10-ka moving window and the actual number of individuals in an assemblage (i.e., the ratio between estimated and true diversity).Even if we fixed the number of otoliths on the x-axis to standardize the diversity (i.e., Hill numbers), for example based on 50 randomly selected otoliths on the x-axis, the diversity estimates remain varied as shown on the yaxis based on the sample coverages (or sampling efforts).Therefore, standardization of all Hill number estimates to a fixed sample coverage (i.e., 85%, vertical dashed line) to fairly compare the diversity in each moving window is necessary (fig.S3).This process was repeated 1000 times to generate mean Hill number accumulation curves in figs.S1-S3.This step-by-step procedure for calculating the Hill numbers has demonstrated that we tackled the sample size/sampling effort issues in estimating diversity, which is always challenging.To our knowledge, rarefaction and extrapolation with Hill numbers ( 23) is currently the best tool for unbiased diversity estimates.It was developed to tackle the uneven sample size issue and has become the standard protocol for diversity estimates in ecological studies.

Fig. S1 .
Fig. S1.Hill number (effective number of taxa) accumulation curve based on resampling the number of individuals (number of otoliths) from the ODP Hole 1115B.

Fig. S2 .
Fig. S2.Relationship between sample coverage and number of individuals (number of otoliths) from the ODP Hole 1115B.

Fig. S3 .
Fig. S3.Coverage-based diversity accumulation curves based on Hill numbers of order = 0 ( 0 D, species richness), 1 ( 1 D, Shannon diversity), and 2 ( 2 D, Simpson diversity).The yellow lines indicate the rarefied (or interpolated) sample coverage, and the purple lines indicate extrapolated accumulation curves based on 1000 permutations.Closed circles indicate the observed Hill numbers.Because not all accumulation curves have reached their respective asymptote, we choose the sample coverage = 85% to standardize the Hill number estimates.Note that ~7.2% of moving windows has lower sample coverage (< 85%), and the lowest maximum sample coverage among moving windows is 62.8%.In other words, the Hill numbers in ~7.2% of moving windows could be underestimated, but the Hill numbers in most moving windows are standardized to equal sample coverage (or sampling efforts) of 85%.

Fig. S4 .
Fig. S4.Raw otolith abundance (A), linear sedimentation rate (LSR, B), sediment bulk density (C), and otolith accumulation rate (OAR, D) over the past 460 ka.Hatched rectangles indicate interglacial periods (Marine Isotope Stages 1, 5, 7, 9, 11 on the upper x-axis).The red circles in panel (B) indicate the original LSR data and the black line indicates the averaged value between any two red circles.The black horizontal lines in panel (D) indicate the mean and the shaded areas show the 95% confidence intervals, and the blue line (p < 0.05) and shaded areas show generalized additive model (GAM) fits on smoothed OAR using a moving-window approach with 95% confidence intervals.

Fig. S6 .
Fig. S6.Hill numbers ( q D) of order q = 0 ( 0 D), q =1 ( 1 D) and q = 2 ( 2 D) based on 85% sample coverage against (A) otolith accumulation rate and (B) benthic δ 18 O.Only significant trend lines are shown (p < 0.05).Open symbols = glacial periods and closed symbols = interglacial periods.Otolith diversity decreases with otolith accumulation rate, but increases with benthic δ 18 O.The blue line and shaded areas show generalized additive model (GAM) fits with 95% confidence intervals.The decline of Hill numbers and the explanatory power (adjust-R 2 ) of GAM are the greatest for 2 D, followed by 1 D and 0 D, suggesting that OAR exerted stronger control on the diversity of dominant ( 2 D) and abundant ( 1 D) species (i.e., myctophids) (A).Moreover, the slopes of GAM fit between diversity and δ 18 O decrease from 2 D, 1 D to 0 D, suggesting that glacialinterglacial change had less impact on the dominant ( 2 D) and abundant ( 1 D) species (i.e., myctophids) (B).

Fig. S7 .
Fig. S7.Sand accumulation rate (SAR, A) of ODP Hole 1115B and power spectrum for SAR (B).SAR is calculated using a 10-ka moving window (Materials and Methods) and the blue dashed line indicates the 90% confidence line (60).

Fig. S10 .
Fig. S10.Sea surface/upper thermocline temperature as a function of benthic δ 18 O.Only significant trend lines are shown (p < 0.05).Sea surface temperature and upper thermocline temperature data are from core MD05-2925 (Materials and Methods).Yellow symbols = data from glacial periods and purple symbols = data from interglacial periods.

Fig. S11 .
Fig. S11.Response curves from the generalized additive models (GAM) based on otolith accumulation rate (OAR) and Hill numbers of order = 0 ( 0 D, species richness), 1 ( 1 D, Shannon diversity), and 2 ( 2 D, Simpson diversity).Vertical panel A shows the GAM response curves (black lines (p < 0.05) with gray shades as 95% confidence intervals) as functions of time (left) and the mean sea surface temperature (SST, right) in 10-ka moving windows.Vertical panel B shows the response curves as functions of time (left) and the mean upper thermocline temperature (UTT, right) in 10-ka moving windows.Symbols indicate the partial residuals of the GAM fits.Gray rectangles on the left sub-panels indicate interglacial periods (Marine Isotope Stages 1, 5, 7, 9, 11 on the upper x-axis).The open symbols on the right sub-panels indicate glacial periods, while closed symbols show interglacial periods.The rug plots indicate data density.