Introduction

The geochemistry of foraminiferal tests from marine sediment is utilized extensively as a tool to infer paleoceanographic variability on timescales ranging from decades to millennia, thereby playing an integral role in our understanding of climate change1,2,3,4,5,6,7. In reconstructing geochemically derived estimates of paleoceanographic parameters, attention must be paid to the ecology and taxonomy of the foraminifera selected for analysis. Inaccurate identification of species could potentially bias or distort reconstructions and add an unknown dimension of uncertainty to quantitative estimates of paleoceanographic parameters8,9.

The planktic foraminifer Globigerinoides ruber (G. ruber) is perhaps one of the most widely used species for reconstructing past sea-surface conditions1,2,3,4,5,6,7. Globigerinoides ruber is ubiquitous in the mixed layer of tropical/subtropical waters and is known to live throughout the year10,11. Thus, its geochemistry is an attractive proxy for past sea-surface temperature (SST) and δ18O of seawater (δ18Osw).

Apart from its pink chromotype, G. ruber (P), multiple morphotypical variants of its white variety, G. ruber (W), have been identified and described in the micropaleontological literature. These include Globigerinoides elongatus12, Globigerinoides pyramidalis13, Globigerinoides cyclostomus14 and the holotypic normalform of G. ruber (W), first described as Globigerina rubra15. More recently, stable isotopic and trace metal geochemistry studies have placed the former three variants under G. ruber sensu lato (sl) while the latter has been termed G. ruber sensu stricto (ss), albeit acknowledging a large range of transitional forms between the morphotypes16,17,18,19,20,21 (Fig. 1). These studies compared the stable isotopic oxygen and carbon composition of the two morphotypes in core-tops, downcore sediments and sediment traps from the South China Sea, Indo-Pacific and Japanese seas16,17,18,19,20,21.

Figure 1

Scanning Electron Micrographs of Globigerinoides ruber (White) morphotypes.

(1) a and b: G. ruber (W) sensu lato; (2) c and d: G. ruber (W) sensu stricto.

Understanding the ecology of modern G. ruber (W) sets up the expectation for interpreting signals derived from downcore geochemical variations. Previous studies based on core-tops and downcore samples, despite small sample numbers (5–23 pairs), inferred that differences in stable isotopes were the result of either distinct calcifying depth habitats (ss: ~0–25 m, sl: ~25–50 m), seasonal preferences (sl: winter-biased), or vital effects between G. ruber (W) ss and sl, and concluded that the sl morphotype was a cold-biased specimen16,17,18,19. These results have critical implications for paleoceanographic reconstructions using a non-selective mixture of the two morphotypes, as they can be biased or distorted due to the averaging of signals from different depths or seasons. By contrast, the sediment trap studies, in which the age of the samples is known with weekly/monthly precision, found little to no difference in the stable isotopes20,21. Hence it is important to study both the modern ecology and geochemistry of these morphotypes using large sample numbers and different sampling archives in order to quantify the degree of bias that may occur due to morphotypical variability in G. ruber (W).

In this work, we study the stable isotopic differences between coeval ss and sl from core-tops, late Holocene downcore samples and a sediment trap in the northern Gulf of Mexico. As a geochemical test of the null hypothesis, we also analyze mixed G. ruber (W) couplets with morphologies intermediate to the ss and sl holotypes (hereafter ‘intermediate’ G. ruber (W) tests) to systematically investigate the composition of different G. ruber (W) subsets: if the geochemical composition of all subsets is comparable, we fail to reject the null hypothesis that morphotypical variability has no effect on G. ruber (W) geochemistry; if the compositions are consistently different, then we reject the null hypothesis and conclude that morphotypical variability has a significant effect on G. ruber (W) geochemistry. We couple these results with INFAUNAL, a recently published foraminiferal statistical model22, to gain insight into the use of the two morphotypes as paleoceanographic recorders in the Gulf of Mexico.

Results

We report here 130 δ18O and δ13C measurements on 37 pairs of G. ruber (W) ss and sl along with 28 pairs of intermediate G. ruber (W) tests from three sampling archives in the northern Gulf of Mexico (Supplementary Fig. 1): core-tops, downcore samples and a sediment trap (Figs. 2 and 3). Table 1 lists the mean and standard deviation of ss-sl measurements in each archive. Here, the mean and standard deviation reflect the overall (time-dependent) variability in our selected samples of each archive; for example, the sediment trap standard deviation is high due to the greater variance of annual temperature/salinity in the sampling interval. This variability notwithstanding, the mean and standard deviation of the ss morphotypes are similar to those of the sl morphotypes. To statistically test these comparisons, we chose to perform Welch's t test23 (a two-sample t test that does not assume equal variances) on the ss-sl pairs with no a priori assumptions about the variance of the underlying populations24. All populations were found to be normal based on a Shapiro-Wilk test25, except for downcore δ13C of ss and sl, where we used the non-parametric Mann–Whitney–Wilcoxon rank-sum test26. From this exercise, we failed to reject the null hypothesis for ss-sl pairs across all archives; that is, the δ18O and δ13C difference between ss and sl morphotypes is not statistically significant at the p<0.05 level (Table 1). We also pooled all the ss-sl pairs across the different sampling archives and tested for regressions using the maximum likelihood estimate method27 incorporating bivariate analytical uncertainty28, where 1σanalytical = 0.08‰ in δ18O and 0.06‰ in δ13C. Within uncertainty, both δ18O and δ13C slopes and intercepts were not significantly different (at the p<0.05 level) from the 1:1 line, where the slope is unity and the intercept is zero (Fig. 3).
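The Welch comparison described above can be illustrated with a minimal sketch that uses only the Python standard library. It computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom; the function name `welch_t` and the input values are hypothetical and are not taken from this study. Converting the statistic to a p value requires a Student's t distribution, which the standard library does not provide, so that step is left out.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t test."""
    nx, ny = len(x), len(y)
    # Sample variances (n - 1 denominator) of each group.
    vx, vy = statistics.variance(x), statistics.variance(y)
    # Squared standard error of the difference in means.
    se2 = vx / nx + vy / ny
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


# Hypothetical values for illustration only (not measurements from this study).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(group_a, group_b)
```

The resulting |t| would then be compared against the critical value of a t distribution with `dof` degrees of freedom at the chosen significance level.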

Table 1 Mean and Standard Deviation of ss-sl Isotopic Measurements in Each Sampling Archive with Outcomes of Welch's t test at p<0.05 level, where H = Ho implies null hypothesis cannot be rejected; H = Ha implies null hypothesis can be rejected

Figure 2

Stable Isotopic Results from the Sediment Trap.

δ18O (a) and δ13C (b) of G. ruber (W) sl (blue circles) and ss (red circles) morphotypes along with δ18O (c) and δ13C (d) of intermediate G. ruber (W) morphotypes (gray squares) reported relative to VPDB (‰) with error bars based on analytical precision (±1σ; δ18O – 0.08‰ and δ13C – 0.06‰) over 2009–2013. Sea-surface temperature (SST) from HadISST40 (orange line) and nearby NDBC Buoy SST (27.795°N, 90.648°W – Green Canyon; green dashed line) over the same time period are plotted in (a) and (c), scaled according to the δ18O axis, based on the slope from Bemis et al. (1998)41. Correlation coefficients are calculated with buoy SSTs when available and HadISST-based SSTs when the former are unavailable.

Figure 3

Regression Analysis of ss-sl Samples Across Three Sampling Archives.

G. ruber (W) δ13C (a) and δ18O (b) results for ss (abscissa) versus sl (ordinate) from the sediment trap (yellow triangles), core-top (orange squares) and downcore samples (green circles) with error bars based on analytical precision (±1σanalytical; δ13C – 0.06‰ and δ18O – 0.08‰). The 1:1 line (black dashed line) along with uncertainty limits (grey dashed lines based on ±1σanalytical) is also plotted. The maximum likelihood regression lines incorporating bivariate uncertainty are: 1) sl-δ13C = (−0.11 ± 0.06) + (0.96 ± 0.06)*ss-δ13C and 2) sl-δ18O = (−0.01 ± 0.05) + (0.96 ± 0.04)*ss-δ18O.

We use offsets between coeval samples as a metric to statistically compare the ss-sl measurements with the intermediate couplets. The absolute offsets between coeval ss-sl samples ranged from 0–0.52‰ in δ13C and 0–0.56‰ in δ18O, while the intermediate couplets ranged from 0–0.50‰ in δ13C and 0.01–0.53‰ in δ18O (see Supplementary Table S1 for the range in each archive). The mean ss-sl offsets in all archives are not systematic and, together with their standard deviations, cluster closely around zero. We tested for mean ss-sl offsets significantly different from zero using a Student's t test in a Monte Carlo framework (n = 5000) to account for analytical error at the ~95% confidence level (i.e. ±2σanalytical), incorporated as a Gaussian distribution. In all archives, the probability that both carbon and oxygen stable isotopic offsets were not significantly distinct from zero was ≥70%, corroborating our initial Welch's t test outcomes and regression analysis: ss-sl couplets have statistically similar variability in stable isotopic composition.
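The Monte Carlo offset test described above can be sketched as follows, under stated assumptions. Each measurement is perturbed with Gaussian noise, a one-sample t statistic is computed on the perturbed offsets, and the fraction of iterations in which the mean offset is not distinguishable from zero is returned. The function name `fraction_not_significant`, the caller-supplied critical value `t_crit` (taken from a t table for n − 1 degrees of freedom) and the way the noise width `sigma` is applied are illustrative choices, not the authors' implementation.

```python
import math
import random
import statistics


def fraction_not_significant(ss, sl, sigma, t_crit, n_iter=5000, seed=0):
    """Fraction of Monte Carlo iterations whose mean ss-sl offset is
    not significantly different from zero (|t| < t_crit)."""
    rng = random.Random(seed)
    n = len(ss)
    not_significant = 0
    for _ in range(n_iter):
        # Perturb both members of each coeval pair with Gaussian noise
        # (assumed here to represent analytical error) and take the offset.
        offsets = [
            (a + rng.gauss(0.0, sigma)) - (b + rng.gauss(0.0, sigma))
            for a, b in zip(ss, sl)
        ]
        mean = statistics.fmean(offsets)
        sd = statistics.stdev(offsets)
        # One-sample t statistic against a mean offset of zero.
        t = mean / (sd / math.sqrt(n))
        if abs(t) < t_crit:
            not_significant += 1
    return not_significant / n_iter
```

A call such as `fraction_not_significant(ss_values, sl_values, sigma=0.08, t_crit=2.776)` would suit five pairs at the two-sided 95% level; the appropriate `t_crit` changes with the number of pairs.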

As the intermediate G. ruber (W) pairs are interchangeable amongst coeval couplets (their transitional form inhibits selective categorization), we generated all possible combinations of the couplets in each archive using binomial expansion (see Methods for details). Next, we computed the mean (μc) and standard deviation (σc) of the offsets for each combination. To gain insight into the variability of the intermediate couplets, we report the average mean offset of all the combinations with its associated standard deviation (<μc> ± σμ; (b1) and (d1) in Table 2) and the average standard deviation of all the offsets with its associated standard deviation (<σc> ± σσ; (b2) and (d2) in Table 2) in the binomially generated combinations. We note that the average standard deviation of the intermediate G. ruber (W) offset (<σc>), a measure of non-morphotypical variability, is statistically similar to the corresponding mean and standard deviation of the ss-sl offset within analytical error (at the p<0.05 level; Table 2).

Table 2 Mean and Standard Deviation (1σ) of Offsets Between Coeval ss-sl Samples (a and c) and Corresponding Mean (b1 and d1), Standard Deviation (b2 and d2) and their Standard Deviation for all Combinations of Intermediate G. ruber (W) Couplets

Apart from stable isotope analysis, the sediment trap allows us to quantify the monthly abundance of each morphotype. Over our sampling interval, in general, we observe the sl morphotypes to be more abundant than the other morphotypes. Concerning seasonal preferences, the census data indicate that neither ss, sl, nor the intermediate G. ruber (W) specimens prefer any particular season in our sampling window (Fig. 4). Moreover, we found no persistent season where one morphotype dominates over the others.

Figure 4

Year-Normalized Flux.

Sediment trap-based year-normalized flux (%) measurements for G. ruber (W) sl (blue), ss (red) and intermediate (grey) morphotypes. Persistent seasonal preferences or abundance of one morphotype over another are not observed. Box containing hatched lines indicates a gap in data collection.

Ruling out a seasonal bias in morphotype, we investigated whether the ss and sl morphotypes had preferential calcifying depth habitats as suggested in previous studies: ss preferring 0–25 m and sl preferring 25–50 m. To assess the potential for resolving depth-specific signals in marine sediment, we applied INFAUNAL22 at surface and subsurface depths in the Gulf of Mexico to construct idealized virtual sediment samples for the two habitats. We constructed two 50-year-long pseudo-δ18Ocarbonate time series using monthly temperature and salinity from the ECMWF ORA-S4 reanalysis dataset29 at depths of 5 m (ss) and 55 m (sl; see Supplementary Fig. 2 ). These depths were chosen based on those available in the ORA-S4 dataset that were closest to the extremes of the previously hypothesized calcification depths (we also performed the experiment using 35 m and 45 m depths; see Methods). Next, we performed bootstrap Monte Carlo picking experiments (n = 10000) on these virtual sediment samples with 50 pseudo-foraminifera to determine whether the offset produced in INFAUNAL would be comparable to the offset observed in the ss-sl data. We chose 50-year-long time series and 50 pseudo-foraminifera for the experiment based on approximately equivalent sample resolution and number of foraminifera analyzed in the core-top and downcore samples (the high temporal resolution and abundance of the sediment trap samples does not allow for a one-to-one downcore analog). The INFAUNAL results indicated that 50 foraminifera picked from the 5 m pseudo-δ18Ocarbonate time series and 50 from the 55 m time series could resolve these depth-specific signals with a high probability (≥90%) and that the offset between the picked means of the idealized time series was significantly distinct from zero (p<0.001). 
However, this idealized population of offsets is significantly different from the ss-sl offsets observed in the core-top and downcore δ18O data (p<0.001), the latter of which are not distinct from zero at the p<0.05 level (Fig. 5).

Figure 5

Data-Model Comparison of Simulated Offsets with Uncertainty Constraints.

Monte-Carlo-based histogram of mean offsets from the ss-sl data (green) in core-top/downcore samples compared to a histogram of mean offsets between pseudo-δ18O time series from 5 m and 55 m depth generated using INFAUNAL22 (orange). Both populations incorporate analytical and sampling uncertainty and are significantly different from each other (p<0.001). Note that the model-offset population is significantly distinct from zero (p<0.001) while the data-offset population is not different from zero at the p<0.05 level.

Discussion

Our observations and statistical tests indicate that there are neither significant nor systematic stable isotopic differences between G. ruber (W) ss and sl populations across three different sampling archives in the Gulf of Mexico (Fig. 3). The variability in δ13C and δ18O of both morphotypes is statistically indistinguishable (Table 1). The intermediate G. ruber (W) samples display very similar variability and contain intra-sample variability comparable to the ss-sl populations (Table 2), thereby preventing us from rejecting the geochemical test of the null hypothesis. Taken together, our observations imply that morphotypical variability in G. ruber (W) has little if any control on its δ13C and δ18O composition.

Though the δ13C variability in the sediment trap samples is seemingly chaotic, the δ18O variability is distinctly controlled by climate. Despite steep rates of change in SST during boreal spring and fall in the northern Gulf of Mexico (~10°C seasonal cycle), the δ18O of both morphotypes in the sediment trap samples reliably captures SSTs (Fig. 2). The same is true for the intermediate couplets. In examining coeval offsets, the δ18O standard deviation is reduced by ~50% compared to the overall standard deviation of each morphotype (~0.6‰ vs. ~0.2‰; Tables 1 and 2), whereas the overall δ13C standard deviations are similar to those of the offsets (~0.2‰ vs. 0.2‰; Tables 1 and 2). This implies that intra-morphotype δ13C is as variable as inter-morphotype δ13C, a result that is in line with previous studies highlighting the complex controls on stable isotopic carbon in foraminifera30,31. We observe similar variability in the intermediate G. ruber (W) couplets (±0.23‰), supporting this interpretation.

What are the ecological implications of our observations concerning the habitat of G. ruber (W) morphotypes and its effect on paleoceanographic reconstructions? From our sediment trap flux data, we find no evidence that ss, sl, or the intermediate G. ruber (W) samples prefer one season over another (Fig. 4). The data also indicate that no particular morphotype is persistently more abundant than any other morphotype. This supports the inference that all morphotypes of G. ruber (W) live throughout the year and that paleoceanographic records generated using the species should be representative of annual conditions, substantiating previous studies in the Gulf of Mexico32,33 and elsewhere34. Using INFAUNAL, we show that pseudo-foraminifera calcifying exclusively at 5 m and 55 m can resolve depth-specific δ18O (temperature and salinity) signals with a very high probability (≥90% with 50 specimens at 50-year sample resolution, and ≥70% at 45 m; see Methods and Supplementary Fig. 3). We also show that the resulting distribution of simulated offsets is significantly different from that of our core-top and downcore data, which is centered on zero (Fig. 5). This result implies that ss and sl morphotypes must dwell at, and migrate to, similar depths in the Gulf of Mexico. We infer that these depths are restricted to the upper portion of the mixed layer due to the excellent correlation between both morphotypes and SST in the sediment trap samples (rss = 0.92, rsl = 0.86; Fig. 2). Thus, our observations and modeling results unequivocally indicate that G. ruber (W)-based paleoceanographic records, regardless of morphotype, reflect annual surface water conditions in the Gulf of Mexico.

Contrary to previous G. ruber (W)-morphotype studies based on core-tops and downcore samples in the South China Sea and Japanese seas16,17,18,19, our findings suggest no morphotype-based biases in utilizing a non-selective mixture of G. ruber (W) ss and sl for paleoceanographic reconstructions. Our findings are, however, corroborated by studies utilizing sediment traps in the Indo-Pacific seas20 and plankton tow samples around Japan35: no geochemical or flux differences were observed in the former, and sea-surface maxima in abundance for both morphotypes were observed in the latter. We suggest that this inconsistency may arise from the large latitudinal extent of sample selection in the previous studies, which results in dissimilar seasonal cycles across the sampling locations, from loose temporal constraints on core-tops, and/or possibly from limited sample numbers, few specimens analyzed per sample and a non-rigorous treatment of uncertainty. For example, we note that the core-top samples from the South China Sea in an early study16 were obtained from a large latitudinal transect spanning 6°N–22°N, over which the seasonal cycle changes from a tropical (smaller seasonal cycle) to a sub-tropical (larger seasonal cycle) setting. The thermocline and other oceanographic features are variable over this latitudinal range as well36,37. Quantifying these multiple sources of uncertainty in the South China/Japanese Sea and focusing solely on ss-sl-based isotopic variability is non-trivial and outside the scope of this work, thereby limiting a one-to-one comparison with our results. Further, the size fraction of specimens used in our study (212–300 µm) is smaller than that of earlier studies (315–400 µm16, 250–350 µm18), adding another barrier to directly comparing these studies, as size and ontogeny can significantly affect δ13C and δ18O variability38,39.
Though preliminary sediment trap work in the South China Sea region is equivocal about the two morphotypes21, a more comprehensive spatially-invariant sediment trap/plankton tow study would certainly assist in interpreting the earlier core-top/downcore studies16,17,18,19.

In summary, we demonstrate the advantage and application of using a comprehensive dataset in tandem with a forward modeling statistical approach to glean insights into ecological variability. Such data-model comparisons characterized with robust uncertainty constraints are useful in discerning the effect of ecological parameters on paleoceanographic reconstructions. In this study, we show that all lines of evidence (observations, null hypothesis testing and data-model comparisons) indicate that G. ruber (W) ss, sl and intermediate morphotypes live throughout the year and dwell in the upper portion of the mixed layer in the Gulf of Mexico. Hence, downcore reconstructions using non-selective mixtures of G. ruber (W) specimens should reflect annual surface water conditions.

Methods

Specimen Selection

We chose G. ruber (W) sensu stricto (ss) and sensu lato (sl) specimens in the 212–300 µm size fraction for all sampling archives. Sensu lato (Fig. 1 a and b) was characterized as a kummerform having three compressed spherical chambers in the final whorl where the final chamber was smaller and flattened compared to the others, forming a moderate to high trochospiral form, with a rounded primary aperture situated asymmetrically over the previous suture. Sensu stricto (Fig. 1 c and d) was characterized as having three spherical chambers in the final whorl that progressively increased in size and had a moderate trochospire shape, with radial sutures containing supplementary apertures and a primary aperture that was wide and more arched than the sl morphotype, symmetric over the previous suture. Intermediate specimens include tests with morphotypical variability transitional to that between ss and sl (for example, a normalform containing compressed chambers in the final whorl and a wide, highly-arched primary aperture that sat asymmetrically over the previous suture or a normalform containing a narrow, rounded, primary aperture sitting symmetrically over the previous suture, with depressed radial sutures with ancillary suture apertures).

Stable Isotope Analysis

We selected 6–20 specimens of each G. ruber (W) morphotype from the sediment trap samples and ≥50 specimens for the downcore/core-top samples. Specimens were crushed, homogenized and cleaned with methanol before geochemical analysis. Stable isotopes were measured using a Thermo-Finnigan MAT 253™ isotope ratio mass spectrometer coupled to a Kiel IV Carbonate Device housed in the Analytical Laboratory for Paleoclimate Studies (ALPS) at the Jackson School of Geosciences, University of Texas at Austin. The 1σ precision of the stable isotopic measurements in this study based on multiple analyses of an in-house carbonate standard (n = 28) is 0.03‰ for δ13C and 0.06‰ for δ18O, consistent with the long-term precision for this instrumental setup (0.06‰ for δ13C and 0.08‰ for δ18O). All stable isotope values are reported relative to Vienna Pee Dee Belemnite (VPDB) in standard notation.
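For reference, the standard delta notation mentioned above is conventionally defined as shown below, with the result expressed in per mil (‰). Here R denotes the heavy-to-light isotope ratio (18O/16O for δ18O, 13C/12C for δ13C); the subscripts are labels introduced for illustration.

```latex
\delta = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{VPDB}}} - 1 \right) \times 1000
```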

Year-normalized Flux

We calculated year-normalized flux (Fig. 4) from the sediment trap using:

Binomial Expansion for Intermediate Couplet Combinations

While computing the mean and standard deviation of the offsets between coeval intermediate pairs for an archive, we considered all possible combinations by interchanging samples of the intermediate couplets. In offset space, this effectively reduces to a change in sign before computing the mean and standard deviation of all the offsets in an archive, thereby following a binary ‘on-off’ pattern. The number of unique combinations n possible for a given number of samples s in an archive is obtained by binomial expansion:

After generating n combinations, we computed the mean (μc) and standard deviation (σc) of the offsets for each combination. We report the average mean offset of all the combinations with its associated standard deviation (<μc> ± σμ; (b1) and (d1) in Table 2) and the average standard deviation of all the offsets with its associated standard deviation (<σc> ± σσ; (b2) and (d2) in Table 2) in the binomially generated combinations to compare the variability of offsets in the intermediate couplets and compare them to the ss-sl offsets.
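The sign-flipping procedure described above can be sketched as follows. The sketch assumes that every couplet can be flipped independently, so that s couplets give 2^s sign combinations. That count is an assumption made here, and the function name `sign_flip_summary` is hypothetical. Using the population standard deviation across combinations is also a choice made for this illustration.

```python
import itertools
import statistics


def sign_flip_summary(offsets):
    """Summarize mean and standard deviation of offsets over all
    sign combinations of the intermediate couplets (needs >= 2 offsets)."""
    means, sds = [], []
    # Each couplet is either kept (+1) or interchanged (-1): an on-off pattern.
    for signs in itertools.product((1, -1), repeat=len(offsets)):
        flipped = [s * d for s, d in zip(signs, offsets)]
        means.append(statistics.fmean(flipped))   # mu_c for this combination
        sds.append(statistics.stdev(flipped))     # sigma_c for this combination
    return {
        "n_combinations": len(means),
        "mean_of_means": statistics.fmean(means),  # <mu_c>
        "sd_of_means": statistics.pstdev(means),   # sigma_mu
        "mean_of_sds": statistics.fmean(sds),      # <sigma_c>
        "sd_of_sds": statistics.pstdev(sds),       # sigma_sigma
    }
```

Because the number of combinations doubles with every added couplet, full enumeration of this kind is only practical for modest numbers of couplets per archive.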

INFAUNAL model

Bootstrap Monte Carlo simulations (n = 10000) were performed to generate a population of means that incorporated analytical uncertainty (±2σ) and the sampling uncertainty involved with utilizing 50 pseudo-foraminifera in a virtual sediment sample representing 50 years, using the Individual Foraminiferal Approach Uncertainty Analysis (INFAUNAL) model for multi-test foraminiferal analysis as described by Thirumalai et al. (2013)22. We applied the algorithm to perform picking experiments on a δ18O time series generated from temperature and salinity data at depths of 5 m and 55 m using the ECMWF ORA-S4 ocean reanalysis dataset29 with data extracted from 26.7°N, 93.9°W (the location of our core-top and downcore samples) in the Gulf of Mexico. Depths of 5 m and 55 m were chosen from the reanalysis dataset because they were the closest to the extremes of the previously hypothesized calcification depths (0–25 m for ss and 25–50 m for sl). To ensure the robustness of these results, we also performed the same INFAUNAL picking experiments at 35 m and 45 m (Supplementary Fig. 3). Similar to the resulting offsets between 5 m and 55 m, we observed that there was a high probability (≥70%) that pseudo-foraminifera calcifying at 5 m versus 45 m can resolve depth-specific δ18O signals. The probability of resolving depth-specific signals using idealized pseudo-foraminifera became lower at 35 m (≥25%), limiting our ability to test the hypothesis of selective ss-sl calcification depths using a model-data comparison. However, all offsets produced by INFAUNAL between 5 m and each of 35 m, 45 m and 55 m are still significantly distinct from zero (p<0.001) and from the δ18O data (p<0.001), the latter of which is not significantly different from zero (we also tested this at 100 and 5000 Monte Carlo simulations and obtained the same outcome). This indicates that it is statistically unlikely that most ss and sl specimens are calcifying deeper than 35 m. 
Furthermore, since the mixed layer at the sediment trap site extends well beyond 55 m for most months of the year ( Supplementary Fig. 4 ), our results hold that both G. ruber (W) morphotypes in the northern Gulf of Mexico calcify in the upper portion of the mixed layer.
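The picking experiment described above can be illustrated with a simplified bootstrap sketch. This is not the INFAUNAL code. It assumes the two pseudo-δ18O series already exist, treats each pick as a random draw with replacement from a series, and adds Gaussian analytical noise to each pooled mean. The function names, the default `sigma` and the synthetic sinusoidal series are hypothetical stand-ins for the reanalysis-derived series.

```python
import math
import random
import statistics


def picked_mean(series, n_pick, sigma, rng):
    """Mean of n_pick pseudo-foraminifera drawn with replacement,
    plus Gaussian noise on the pooled measurement (an assumption)."""
    picks = rng.choices(series, k=n_pick)
    return statistics.fmean(picks) + rng.gauss(0.0, sigma)


def offset_population(shallow, deep, n_pick=50, sigma=0.08,
                      n_iter=10000, seed=0):
    """Bootstrap population of (deep - shallow) offsets in picked means."""
    rng = random.Random(seed)
    return [
        picked_mean(deep, n_pick, sigma, rng)
        - picked_mean(shallow, n_pick, sigma, rng)
        for _ in range(n_iter)
    ]


# Synthetic 50-year monthly series for illustration only; these are not
# the ORA-S4-derived pseudo-d18O series used in the study.
months = range(50 * 12)
shallow = [-1.5 + 0.8 * math.sin(2 * math.pi * m / 12) for m in months]
deep = [-1.0 + 0.4 * math.sin(2 * math.pi * m / 12) for m in months]

offsets = offset_population(shallow, deep)
# Share of iterations in which the deeper series yields the heavier mean.
frac_resolved = sum(o > 0 for o in offsets) / len(offsets)
```

The resulting offset population could then be compared with the population of observed ss-sl offsets, for example with the two-sample comparison used elsewhere in the analysis.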