Sixty-one thousand recent planktonic foraminifera from the Atlantic Ocean

Marine microfossils record the environmental, ecological, and evolutionary dynamics of past oceans in temporally expanded sedimentary archives. Rapid imaging approaches provide a means of exploiting the primary advantage of this archive, the vast number of fossils, for evolution and ecology. Here we provide the first large-scale image and 2D and 3D shape dataset of modern planktonic foraminifera, a major microfossil group, from 34 Atlantic Ocean sediment samples. Information on more than 124,000 objects is provided, including general object classification for 4/5ths of the dataset (~99,000 objects). Of the ~99,000 classifications provided, more than 61,000 are complete or damaged planktonic foraminifera. Objects also include benthic foraminifera, ostracods, pteropods, spicules, and planktonic foraminifera test fragments, among others. This dataset is the first major microfossil output of a new high-throughput imaging method (AutoMorph) developed to extract 2D and 3D data from photographic images of fossils. Our sample preparation and imaging techniques are described in detail. The data provided here comprise the most extensive publicly available archive of planktonic foraminiferal morphology and morphological variation to date.


Background & Summary
Paleontology and evolutionary biology are in the midst of a revolution driven by the proliferation of three-dimensional imaging technologies 1,2 . Nano- to micro-CT scanning and synchrotron-based tomography provide powerful tools for addressing questions of ontogeny, morphology, ecology, and phylogeny through submicron-scale volumetric resolution of fossils [3][4][5] . Population-level studies of 3D-morphological evolution have remained relatively rare, however, due to the time- and data-intensive nature of these approaches 2 . To address this gap, we have developed a high-throughput approach for extracting 2D and 3D shape information from photographic images called AutoMorph 6,7 and have used this technique to generate extensive image and shape data for modern planktonic foraminifera.
Planktonic foraminifera are mixotrophic protists with calcium carbonate tests found primarily in the sunlit layers of the global ocean 8,9 . Due to their abundant fossil record and importance in paleoceanographic research, planktonic foraminifera and other microfossil groups (i.e., coccolithophores, radiolarians, and diatoms) have been the focus of many (semi-)automated approaches for extracting information on factors like size, 2D shape, calcite thickness, and species [10][11][12][13] . Despite this long history of extensive imaging, there are few shared datasets consisting of the primary data (i.e., original images and measurements) of the many millions of microfossil measurements and images made to date (see http://data.nhm.ac.uk//dataset/henry-buckley-collection-of-planktonic-foraminifera), likely due to the difficulty of sharing large files in the past 14 . Even the growing number of data aggregators and archives like iDigBio, MorphBank, MorphoBank, and Figshare have remits and/or storage limitations that preclude the storage of large datasets like the one we describe here. This data sharing gap is important because it precludes the data being re-used for other purposes, including documenting the range of morphological variation within planktonic foraminiferal species.
Here we provide an extensive image library of modern planktonic foraminifera, with accompanying 2D and 3D coordinate data and morphometric measurements from Atlantic Ocean core top sediment samples. Images of 61,849 complete and damaged planktonic foraminifera are provided along with accompanying 2D and 3D morphometric data for nearly all objects (i.e., 57,304 of the complete and damaged planktonic foraminifera provided were successfully extracted for 2D and 3D shape). Images and shape data for another ~37,000 classified objects are also provided in categories including planktonic foraminiferal fragments, pteropods, ostracods, etc. (see Methods for further details). We have withheld the object identities for 1/5th of the entire sample set (24,846 of the ~124,000 total objects) so that these images can be used as the test set for automatic image recognition algorithms (i.e., machine learning).
We primarily sampled morphological variation in the North Atlantic for practical and theoretical reasons. The vast majority of the roughly fifty morphological species of extant planktonic foraminifera are found in all ocean basins and hemispheres 15,16 , with morphological and genetic differentiation across environmental gradients 9,17 . Thus, while the dataset presented here primarily describes North Atlantic variation, it should be broadly representative of global variation in community morphology. From a practical perspective, we sampled in the Atlantic in order to obtain the best-preserved fossils. The Atlantic Ocean has far more well-preserved, carbonate-rich deposits, due to younger (i.e., less acidic) bottom waters and shallower average depths than the Pacific and Indian Oceans 18,19 . Preservation was important to ensure that we captured variation in morphology arising primarily from processes acting on living, rather than dead 20,21 , foraminifera.
Fossils were imaged and shapes extracted using automated slide scanning and a high-throughput image processing pipeline (AutoMorph), developed in-house to rapidly extract 2D and 3D shape information from light microscopic images 6,7 . Because the AutoMorph pipeline is relatively new, we describe in detail our sampling and imaging protocols for samples dominated by planktonic foraminifera. Relevant usage notes for this dataset are also provided. The AutoMorph software is available and frequently updated on GitHub (https://github.com/HullLab), and the images and shape data are available on Zenodo (http://doi.org/10.5281/zenodo.165514). The publicly available dataset presented here provides the most extensive image, 2D, and 3D shape documentation of the range of morphological variation observed in recent planktonic foraminifera to date, and provides a baseline for considering variation in morphology across both time and space.

Sample Selection and Preparation
Discerning the relative importance of environment, preservation, and biotic interactions on patterns observed in fossil assemblages often requires considering individual fossils in the context of their sedimentary environment and broader sample assemblage 22 . With this in mind, we imaged entire assemblages of fossils and sedimentary constituents from 31 core top sediment sites from the North Atlantic and three core top sites from the South Atlantic (Fig. 1). Sites were chosen so as to span the five major planktonic foraminiferal faunal provinces identified by previous authors 15 . Sample depth (depth below the sediment-water interface) varied from a minimum of 0-0.5 cm core depth to a maximum of 0-3 cm (Table 1), with broader depth ranges generally corresponding to a greater amount of geological time captured by the sample. Some of the core tops provided by B.H. Corliss had age estimates determined on the basis of benthic foraminifera oxygen isotopes 23 (Table 1).
All core top samples were sieved to obtain the >150 μm fraction. The >150 μm fraction was then subsampled with a micropaleontological microsplitter down to ~5000 objects (primarily composed of planktonic foraminifera). Subsampled objects were arranged and lightly glued to plain black micropaleontological slides using a binocular stereo microscope (Fig. 2). Foraminifera were oriented with the umbilical side facing up, and fragments and ostracods were oriented with the concave side up. We aimed to mount ~1000 objects per slide, in order to prevent adjacent objects from touching, for a total of 3-6 prepared slides for each subsample of ~5000 objects. In all, 155 slides were prepared from the 34 sites (Table 2 (available online only)).

Imaging
Prepared slides were imaged using a 5-megapixel Leica DFC450 digital camera mounted on a Leica Microsystems DM6000M compound microscope with a drive focus and motorized x-y scanning stage. The microscope system is controlled by Surveyor Software (Version 7.0.1.0, Objective Imaging Ltd) run on a Dell computer (3 TB Solid-State Drive, 3.7 GHz processor) coupled to an OASIS-blue 3 Stage Controller (Objective Imaging Ltd). Three slides were prepped and scanned at a time using Surveyor's multi-slide scanning mode (i.e., Navigator mode), which allows for multiple user-defined scanning regions and variable background heights. Under our imaging pipeline, every slide scan generates a stack of raw slide images (called planes) at different z-axis heights. The number of planes per slide depends on the z-step size and the z-range (i.e., vertical extent of the volume imaged). All slides in this study were imaged with a z-range of 950 μm and a z-step size of 31.1 μm. Every slide region defined in Navigator was imaged and saved as a series of BigTIFFs: one BigTIFF for every z-plane through the slide and a single extended-depth-of-field (EDF) composite image. The BigTIFF image format is an extension of the more common TIFF file format, but is designed for large images (>4 GB). In this study, all slides were imaged with a 5x objective and illuminated with dark field lighting.

AutoMorph (automated morphometric post-processing)
Imaged slides were processed with the AutoMorph software package (http://github.com/HullLab), a bioinformatics pipeline designed to segment individual objects from light images and extract 2D and 3D shape information 6,7 . There are four major routines in AutoMorph: segment, focus, run2dmorph, and run3dmorph.
The first two routines (segment and focus) identify all the unique objects in a raw image (i.e., a scanned slide), extract and label all the objects from the raw images, and save the individual z-slices in unique directories, generating a single best 2D extended depth of focus (EDF) image for each object. Two different programs can be used to generate the best 2D EDF: the commercially available Zerene Stacker (ver. 201404082055) and the open source ImageJ. We generated all 2D EDFs with Zerene Stacker because it consistently produced better EDF images. The second two routines (run2dmorph and run3dmorph) extract shape coordinates and basic measurements in 2D and 3D, respectively, along with images of the 2D and 3D shape extraction for quality control. This software package is freely available on GitHub (http://github.com/HullLab) and the methods are described in detail in two publications 6,7 . Because we developed AutoMorph to generate this data set, code updates were made over the course of the project. Code versions and processing dates are listed in Table 2 (available online only) to track these revisions. AutoMorph is adapted to run on local computers and clusters, and this dataset was generated using both. Once slide images were processed, all unique objects were identified by human observers (PMH and LEE) to one of 16 categories (Fig. 3) using classify (available at http://github.com/HullLab), a custom-made application for viewing and assigning general object information to images (Supplementary Table 1). Supplementary Table 1 lists the classification (Object Type) of all objects (listed by Object ID) by slide (YPM Catalog Number), along with classification confidence (Certainty). In total, 124,230 unique objects from 155 slides were segmented and classified (Supplementary Table 1 and Table 3). In Supplementary Table 1, we provide information for 4/5ths of the sample set (99,384 objects). The remaining 1/5th of the sample set (i.e., 24,846 objects) is listed only as 'withheld' so that these objects can be used to test machine learning algorithms trained on the dataset provided here. Table 3 indicates the number of objects in each object category by sample site.
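As a minimal sketch of how the labelled four-fifths might be separated from the withheld fifth, the following assumes the classification table has been exported as CSV using the column names given above (YPM Catalog Number, Object ID, Object Type, Certainty). The inline rows and object IDs are invented for illustration, and the actual file layout of Supplementary Table 1 may differ.

```python
import csv
import io

# Invented example rows; the real Supplementary Table 1 layout may differ.
SAMPLE_CSV = """YPM Catalog Number,Object ID,Object Type,Certainty
IP.307847,obj00001,complete,very
IP.307847,obj00002,fragment,somewhat
IP.307847,obj00003,withheld,
IP.307848,obj00001,damaged,very
IP.307848,obj00002,withheld,
"""


def split_labelled_and_withheld(handle):
    """Return (labelled, withheld) lists of row dicts from a classification table."""
    labelled, withheld = [], []
    for row in csv.DictReader(handle):
        if row["Object Type"].strip().lower() == "withheld":
            withheld.append(row)   # identity hidden: candidate test set
        else:
            labelled.append(row)   # identity provided: candidate training set
    return labelled, withheld


labelled, withheld = split_labelled_and_withheld(io.StringIO(SAMPLE_CSV))

# Complete and damaged planktonic foraminifera among the labelled objects.
forams = [r for r in labelled if r["Object Type"] in ("complete", "damaged")]
```

Because object IDs are listed by slide, a pair of (YPM Catalog Number, Object ID) is the safer key when joining these rows to image or shape files.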

Data Records
Metadata and images are provided for all 124,230 objects in the data set, with 2D and 3D shape information successfully extracted from 109,198 objects. Of the 61,849 complete and damaged planktonic foraminifera identified here with images and metadata, 57,304 also have accompanying 2D and 3D shape information and an additional 2,500 have 2D shape information only. The tables in this data report provide relevant metadata, summary statistics, and details on the technical validation of measurements. Sample identity, location, source, and handling information is provided in Table 1 and summarized in Table 3. Because 2D and 3D shape and size information was extracted automatically, Table 4 provides technical validation for ten objects measured by stage micrometer, in ImageJ, and with all the various AutoMorph code versions used in this study. The data products of this research are all available on Zenodo (Data citation 1). The Zenodo data citation includes nine distinct data types uploaded as 13   ix) 3d_obj_files_part1.tar.gz, 3d_obj_files_part2.tar.gz, and 3d_obj_files_part3.tar.gz: combined, parts 1, 2, and 3 contain 109,207 3D shape coordinate files (.obj files) of objects successfully extracted for 3D shape and the text file of failed 3D extractions (unextracted_objects_3D.txt). The first data product, the slide images of boxed objects, is also available in a low resolution version on the Yale Peabody Museum's collection portal (http://collections.peabody.yale.edu/search/), under the division of Invertebrate Paleontology, by searching with the YPM collection number listed in Table 2 (available online only).

Technical Validation
Technical validation occurred at a number of steps in the image processing pipeline, and included object selection, shape extraction, size measurements, and object classification.

Object Selection
The AutoMorph segment module saves a slide overview (a low resolution EDF) with each identified individual object boxed in red (Fig. 2; full   data citation). To verify that all microfossils were identified and selected from each slide, we visually checked the boxed slide output. Image selection parameters in segment were adjusted as needed to optimize object selection. For a given set of image segmentation parameters, object selection is deterministic (i.e., the same objects are identified in the same order with every software run). The deterministic nature of the object selection software was verified by re-segmenting three slides twice and one slide three times and confirming the number and identity of objects. The number of objects output by segment was then cross-checked against the number of objects output by all following modules (focus, run2dmorph, and run3dmorph) for each slide.
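The two checks described above (identical object lists across repeated runs, and matching object counts across modules) can be sketched as follows. This is an illustration only and not part of AutoMorph: the function names, the object IDs, and the per-module counts are hypothetical, and the counts are assumed to include objects listed as failed extractions.

```python
def same_segmentation(run_a, run_b):
    """True if two segmentation runs list the same object IDs in the same order."""
    return list(run_a) == list(run_b)


def count_mismatches(counts_by_module):
    """Return the modules whose object count differs from that of 'segment'."""
    expected = counts_by_module["segment"]
    return {name: n for name, n in counts_by_module.items() if n != expected}


# Hypothetical object IDs from two runs with identical parameters.
first_run = ["obj00001", "obj00002", "obj00003"]
second_run = ["obj00001", "obj00002", "obj00003"]
deterministic = same_segmentation(first_run, second_run)

# Hypothetical per-module object counts for one slide.
mismatches = count_mismatches(
    {"segment": 3, "focus": 3, "run2dmorph": 3, "run3dmorph": 2}
)
```

An empty `mismatches` dictionary would indicate that every downstream module accounted for every segmented object.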

Shape Extraction
2D EDF images of individual objects were generated by the focus module and this output was checked by eye for the first 100 objects in each slide to ensure proper image compositing (see edf_images.tar.gz in data citation). 2D and 3D shape extraction occurred along 2D outlines and 3D meshes of individual objects. The quality of 2D shape extraction was checked visually for the first 200 objects in each slide using outline-object overlays (see 2d_outline_check.tar.gz in data citation) and run2dmorph parameters were adjusted, when necessary, to optimize the efficacy of 2D outline extraction. Similarly, the quality and parameters of 3D shape extraction were checked visually using 3D PDFs of object meshes (see 3d_pdfs.tar.gz in data citation). Both run2dmorph and run3dmorph output lists of objects with failed outline (or mesh) extractions. These lists were examined for each slide to ensure that complete foraminifera were included and that specific species were not being disproportionately missed. When extraction was problematic (e.g., a large number of complete foraminifera failed to extract), the routines were re-run with different image extraction parameters to ensure the best possible 2D extraction. The same set of image extraction parameters yielded satisfactory results for 3D shape extractions of complete planktonic foraminifera from all samples.

Size Measurements
The accuracy and reproducibility of 2D and 3D size extraction was confirmed with direct measurements. For run2dmorph, a calibration slide (IP.307866), containing ten complete planktonic foraminifera from four species, was used to check 2D size extraction (Table 4). This slide can be viewed in the YPM collections digital database (http://collections.peabody.yale.edu/search/). In total, ten complete planktonic foraminifera from four species were measured along their minor and major axes using a stage micrometer on a Leica S8APO microscope. The calibration slide was also segmented with each of the three code versions of the segment module of AutoMorph, and then processed through run2dmorph to obtain automated measurements of the major and minor axis for each individual foraminifer. The same individuals' major and minor axis lengths were also measured in ImageJ using each of the three segment outputs. To do this, the ImageJ scale was set using the automatic scale bar added to the image label by segment, and the major and minor axes were drawn by hand. The three measurement types (run2dmorph, ImageJ and stage micrometer) were then compared (Table 4 and Fig. 4). Fig. 4a and b illustrate the relative reproducibility of the fully automated measurements (Fig. 4a: AutoMorph, three segment code versions) versus traditional ImageJ measurements (Fig. 4b: ImageJ). In both panels, object measurements are normalized to the mean measurement to highlight the variation between repeated measurements and the relative reproducibility of both approaches. AutoMorph (Fig. 4a) clearly outperforms hand measurements (Fig. 4b: ImageJ) in reproducibility, although both approaches have no significant difference between batches (AutoMorph one way ANOVA F(2,55) = 0.0154, p = 0.985; ImageJ one way ANOVA F(2,57) = 0.00058; p = 0.999). 
The small amount of variation that does exist between repeated AutoMorph measurements is due to a switch between a MATLAB code base (the original modules, segment versions 9-3-2014b and 10_27_2015) and a Python code base (segment version 6_17_2016). MATLAB code versions gave identical results, and all Python output was within 0.49 microns of the MATLAB output (Table 4). Repeated hand measurements in ImageJ had as much as a 16 micron difference between measurements. Importantly, all three approaches (AutoMorph, ImageJ, and measurement with a stage micrometer) provide the same average 2D measurements for foraminifera (Fig. 4c). Averaged AutoMorph output and stage micrometer measurements by specimen, as well as averaged AutoMorph output and ImageJ measurements by specimen, were not significantly different (ANCOVA F(6,131) = 0.036; p = 1). Together, these tests indicate that AutoMorph provides accurate and reproducible 2D measurements of foraminifera. The accuracy and precision of 3D size extraction were previously assessed 7 by comparing the height extraction with the length and width of spherical objects and by examining the effect of object orientation and imaging conditions on 3D mesh extraction and volume estimation (see ref. 7 for details). These tests indicated height extraction within 7.6% of the major and minor axis lengths for spherical objects.
Table 4 (caption, partial): [...] Species and Specimen) was measured along its minor and major axes with a stage micrometer on a Leica S8APO microscope, and in ImageJ using the scale bar added by the segment module of AutoMorph. Automated size measurements from run2dmorph are also provided as 'AutoMorph μm' for each foraminifer. In one instance, run2dmorph failed to extract the object outline, as indicated by the n/a.
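The normalization used to compare reproducibility in the text (each repeated measurement divided by the mean of the repeats for that object) can be sketched as below. The measurement values are invented for illustration and are not taken from Table 4; the function names are hypothetical.

```python
from statistics import mean


def normalize_to_mean(repeats):
    """Divide each repeated measurement of one object by that object's mean."""
    m = mean(repeats)
    return [x / m for x in repeats]


def max_spread(repeats):
    """Largest difference (same units as the input) between any two repeats."""
    return max(repeats) - min(repeats)


# Invented major-axis lengths (microns) for one specimen, measured three times.
automated = [412.0, 412.0, 412.4]   # e.g. three code versions
by_hand = [405.0, 418.0, 411.0]     # e.g. three hand-drawn axes

automated_norm = normalize_to_mean(automated)
by_hand_norm = normalize_to_mean(by_hand)
```

After normalization every object is centred on 1, so the scatter of values around 1 can be compared across objects of different sizes.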

Object classification
Extensive spot checks of final EDF image classifications found object classification by human observers to be 99.95% accurate, with different types of errors characterizing each classification category. The errors are described briefly here (category listed in quotes followed by a list of object types included in error), with each classification category described in more detail in Usage Notes. Noted classification errors include 'agglutinated': clipped and unknown; 'benthic': clipped; 'clipped': mollusk and unknown; 'complete': damaged, clipped, and touching; 'damaged': complete; 'echinoid spine': unknown; 'mollusk': clipped and touching; 'fragment': unknown, mollusk, touching, and radiolarian; 'radiolarian': clipped; and 'rock': agglutinated. Chunks of consolidated sediment were generally poorly classified. The proper classification of a sediment chunk should be 'rock', a category which includes rock-like objects, but sediment chunks occurred in 'agglutinated', 'touching', 'unknown', and 'rock'. Notably, as a category, 'rock' contained far more rock-like objects than actual lithic fragments. The occurrence of small foraminifera nested within complete, damaged and/or fragments of larger foraminiferal tests was similarly problematic. These combinations were assigned the classification of the larger object in cases where the small foraminifera were completely nested within the outline of the larger object. In cases where the small foraminifera obscured the outline of the larger object, the total image was classified as 'touching'.

Usage Notes
The splits of core top samples used in this study were, to our knowledge, unbiased by previous research efforts undertaken on the material, with exception to the benthic foraminifera. Many of the samples were picked for specific species of benthic foraminifera in the past, so all benthic foraminifera results should be considered as illustrative of some of the species present but not necessarily quantitative representations of their original abundance or full diversity in the samples. More generally, it is worth noting that most of the core top samples used here have a long collection history in other laboratories, so it is possible that selective sampling of some planktonic foraminifera or other species occurred in the past without our knowledge. Besides this effect, it is worth reiterating that the assemblage data provided here comes from death assemblages. In spite of visual evidence for good preservation in most of the core top samples included, selective dissolution of small-bodied and delicate species is known to begin even in the water column 20,21 , and the assemblages imaged are certainly time-averaged on the scale of hundreds to many thousands of years.
Objects that failed to properly extract for 2D and/or 3D shapes are listed in each of the appropriate data files (i.e., data citation files 2d_outline_check.tar.gz, 2d_coordinates.tar.gz, 3d_pdfs.tar.gz, and 3d_obj_files.tar.gz). Although we include all images extracted by segment in this dataset, note that our initial sieve size was 150 microns. Although there are a number of objects smaller than 150 microns in this dataset, they are not representative of the abundance of this size category in the original sample. Rather, they are the rare objects that slipped through our size filter, and should be excluded for most applications. At least one ancient fossil appears in the core top data set. We have left this ancient fossil in as an indication of the level of cross-contamination in the lab (very low but potentially present). It is also possible that this stratigraphically out-of-place foraminifer was reworked in the sediments or introduced during handling in other labs. Regardless, users should remove such outliers in species-specific applications.
Samples from YPM Sites IPE.08282, IPE.08285 and IPE.08295 were size fractionated when received and different sized splits were taken from each size fraction. Here we describe the post-processing that we carried out to ensure that images from these samples accurately reflect species and size distributions at those sites. YPM Site IPE.08282 arrived in three sample jars containing, respectively, the 125-250 μm size fraction, the 250-315 μm size fraction, and the greater than 315 μm size fraction. The 125-250 μm size fraction was sieved over a 150 μm sieve and a 1/64th split was mounted on four slides (IP.308160, IP.308161, IP.308162, and IP.308163); a 1/32nd split of the 250-315 μm size fraction was mounted on two slides (IP.307847 and IP.307848); and a 1/32nd split of the greater than 315 μm size fraction was mounted on three slides (IP.307849, IP.307850 and IP.307851). This size-fractionated handling of these sites (Sites IPE.08282, IPE.08285 and IPE.08295) is problematic because it introduces a bias by over-representing certain size classes in the imaged object output. In the case of YPM Site IPE.08282 the largest two size fractions (the 250-315 μm and the greater than 315 μm size fractions) were over-represented by a 1/32nd split relative to the smallest size fraction (150-250 μm size fraction with a 1/64th split imaged). To correct for this bias, it was necessary to subsample the object output from these slides to properly represent the relative distributions of objects in the original sample.
More specifically, for IPE.08282 half the objects were randomly selected and discarded from the combined object list of IP.307847 and IP.307848 (the 250-315 μm size fraction) and from the combined object list of IP.307849, IP.307850 and IP.307851 (the greater than 315 μm size fraction), so that all size fractions contained a ~1/64th split of objects from the original site sample. For YPM Site IPE.08285, the largest two size fractions (the 250-315 μm and the greater than 315 μm size fractions) were over-represented by a 1/2 split relative to the smallest size fraction (150-250 μm size fraction with a 1/16th split imaged). To obtain a 1/16th split across size fractions, one in every eight objects (12.5%) was randomly selected from the combined object lists of IP.307857 and IP.307858 (the greater than 315 μm size fraction) and from the combined object lists of IP.307859 and IP.307860 (the 250-315 μm size fraction). For YPM Site IPE.08295, the largest size fraction (the greater than 250 μm size fraction) was over-represented by a 1/32nd split relative to the smallest size fraction (150-250 μm size fraction with a 1/256th split imaged). To obtain a 1/256th split across size fractions, one in every eight objects (12.5%) was randomly selected from the combined objects in IP.307853 and IP.307854 (the greater than 250 μm size fraction). This data report includes only the objects retained after down-sampling, so the bias introduced during slide preparation should be corrected in the data provided.
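The arithmetic behind this down-sampling is that the fraction of objects to retain from an over-represented size fraction equals the split imaged for the smallest size fraction divided by the split imaged for the over-represented fraction. The following is a minimal sketch of that calculation and of a seeded random selection; it is not the code used to build the dataset, and the function names, the seed, and the object IDs are hypothetical.

```python
import random
from fractions import Fraction


def retention_fraction(imaged_split, target_split):
    """Fraction of objects to keep so an imaged split matches the target split."""
    keep = Fraction(target_split) / Fraction(imaged_split)
    if keep > 1:
        raise ValueError("target split is larger than the imaged split")
    return keep


def downsample(object_ids, keep, seed=0):
    """Randomly retain a fraction `keep` of the objects (reproducible via seed)."""
    ids = list(object_ids)
    n_keep = round(len(ids) * keep)
    rng = random.Random(seed)
    return sorted(rng.sample(ids, n_keep))


# Splits quoted in the text (imaged split of the coarse fraction, smallest-fraction split).
keep_08282 = retention_fraction(Fraction(1, 32), Fraction(1, 64))    # half retained
keep_08285 = retention_fraction(Fraction(1, 2), Fraction(1, 16))     # one in eight
keep_08295 = retention_fraction(Fraction(1, 32), Fraction(1, 256))   # one in eight

# Hypothetical combined object list from two slides of one size fraction.
combined = [f"obj{i:05d}" for i in range(1, 17)]
retained = downsample(combined, keep_08285)
```

Using exact fractions avoids rounding surprises when the ratio is applied to object counts, and fixing the seed makes the retained subset repeatable.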
Each object was classified by a human observer according to one of sixteen categories (Supplementary Table 1), along with an indication of confidence in the classification: 'very', 'somewhat', and 'not'. In a few classification categories, the confidence categories were used to indicate other attributes; these exceptions are explained below. Classification categories, listed in Fig. 3, were defined as follows. 'Agglutinated' indicates a complete agglutinated foraminiferal test, or some part thereof. Low confidence in this category (i.e., 'agglutinated', 'not') typically occurred when the agglutinated fragment was so small as to make it difficult to distinguish between an individual rock and individual foraminifera. 'Benthic' denotes any clearly identifiable piece of a benthic foraminifer (i.e., complete, damaged or fragment of a benthic foraminiferal test). Lower to low confidence (i.e., 'somewhat' or 'not') in the 'benthic' assignment arose when test fragments were too small to confidently assign or when individuals were too small or indeterminate to assign to either benthic and/or planktonic foraminifera categories. 'Clipped' indicates any image with at least one edge of the object clipped, with the exception of objects in the category 'spicule' as described below. 'Complete' indicates complete tests of planktonic foraminifera: a category that includes dirty tests (stained and/or visibly covered with some amount of sediment), but not tests that are broken or fragmented. The three confidence categories for 'complete' planktonic foraminifera were used in a non-standard way: i) 'very' indicates objects identified as complete planktonic foraminifera with high confidence; ii) 'somewhat' indicates all small bodied and juvenile individuals, where confident assignment to benthic or planktonic habitats was difficult; and iii) 'not' indicates planktonic species Hirsutella scitula and Hirsutella theyeri and similar looking benthic foraminifera. 
Damaged tests of planktonic foraminifera were classified as 'damaged' for all breaks, drill-holes, and damage assessed to affect less than around a third of the test. All cases of severe damage to planktonic foraminifera, including small planktonic foraminiferal fragments, were classified as 'fragment'. The 'diatom' category contains diatom frustules, the 'echinoid spine' category contains echinoid spines, the 'mollusk' category contains mollusks, the 'ostracod' category contains ostracods, and the 'radiolarian' category contains radiolarians. In each of these (diatom, echinoid spine, mollusk, ostracod, and radiolarian), complete or large fragments of organisms were typically identified with greater confidence than small or out-of-focus pieces. Echinoid spines were confirmed as echinoid in nature by the match of the distinctive lattice structure in spine images with those of an immature echinoid in the YPM Invertebrate Zoology collection (YPM IZ.087653). The 'unknown' category contains non-target items, such as bits of background from the slide, fibres, and other unknown objects. Small pebbles, minerals, and other rock-like objects were categorized as 'rock'. Sponge spicules, categorized as 'spicule', were almost always clipped by the automated image segmenting routine. As a result, we included all clipped images of spicules in the category 'spicule' in spite of the incomplete nature of the image. All ichthyoliths (including fish teeth, shark dermal denticles, and other pieces of apatite) were categorized as 'tooth', with notably few actual teeth in this dataset. Small pieces of apatite and other ichthyoliths can be very difficult to identify, so many are likely categorized as 'unknown' or 'rock'. Finally, when two or more objects touched, they were categorized as 'touching'. Objects in direct or very near contact cannot be accurately extracted for 2D and 3D morphometrics.

www.nature.com/sdata/ SCIENTIFIC DATA | 5:180109 | DOI: 10.1038/sdata.2018.109