Digital light microscopy provides powerful tools for quantitatively probing the real-time dynamics of subcellular structures. Thorough documentation and quality assessment are required to ensure that imaging data may be properly interpreted (quality), reproduced (reproducibility) and used to extract reliable information and scientific knowledge that can be shared for further analysis (value). In the absence of community guidelines and tools, it is inherently difficult for manufacturers to incorporate standardized configuration information and performance metrics into image data and for scientists to produce comprehensive records of imaging experiments.

To solve this problem, the 4D Nucleome Initiative (4DN)1,2 Imaging Standards Working Group (IWG), working in conjunction with the BioImaging North America (BINA) Quality Control and Data Management Working Group (QC-DM-WG)3,4, here propose flexible microscopy metadata specifications for light microscopy5,6,7 that cover a spectrum of imaging modalities and scale with the complexity of the experimental design, instrumentation and analytical requirements. They consist of a set of three extensions of the Open Microscopy Environment (OME) Data Model8,9, which forms the basis for the ubiquitous Bio-Formats library9. Because of their tiered nature, the proposed specifications clearly define which provenance10 and quality-control metadata should be recorded for a given experiment. This endeavor is closely aligned with the recently established QUAlity Assessment and REProducibility for Instruments and Images in Light Microscopy (QUAREP-LiMi) global community initiative11,12,13. As a result, the ensuing 4DN-BINA-OME (NBO) framework5,14, alongside three interoperable metadata collection tools being developed in parallel (OMERO.mde, Micro-Meta App and MethodsJ2)15,16,17,18,19,20, represents a major turning point toward increasing data fidelity, improving repeatability and reproducibility, easing future analysis and facilitating the verifiable comparison of different datasets, experimental setups and assays. The intention of this proposal is therefore to encourage participation, constructive feedback and contributions from the entire imaging community and all stakeholders, including research and imaging scientists, facility personnel, instrument manufacturers, software developers, standards organizations, scientific publishers and funders.

Introduction

The reproducibility crisis affecting the biological sciences is well documented21,22,23,24,25. In the field of light microscopy, it can only be addressed if all published images are accompanied by complete descriptions of experimental procedures, biological samples, microscope hardware specifications, image acquisition settings, image analysis parameters and metrics detailing instrument performance and calibration9,22,26,27. This complete description, also known as image metadata, consists of any and all information about an imaging experiment that ensures its rigorous interpretation, reproducibility and reusability, and should be recorded in scientific publications and alongside the actual image data in the file header or in supplementary files6,7. A fully developed metadata model would provide for consistent tracking of crucial information pertaining to the quality, reproducibility and scientific value of image data, and would allow the communication and comparison of such information in a Findable, Accessible, Interoperable and Reusable (FAIR) manner6,28 (see also Text Box I in ref. 6). However, as microscopy has evolved from a tool that generates purely descriptive or illustrative data to one that generates primary quantitative data acquired with ever more sophisticated and complex instruments, our practices for recording these quantitative data and metadata faithfully and reproducibly have not kept up.

The OME consortium29,30 has made significant advances with the development of the OME Data Model8,9, which, together with the ubiquitous Bio-Formats image file format conversion library9, serves as the only available de facto specification for accessing and exchanging image data. Nonetheless, the field of light microscopy still lacks much-needed community-mandated standards for imaging data and specifications for metadata (i.e., microscopy image data standards; Fig. 1)8,9, resulting in an unmanageable growth of proprietary and/or incompatible image file formats and metadata capture practices.

Fig. 1: The definition of community-driven microscopy image data standards requires three complementary components and needs a flexible framework to manage complexity.

a, The establishment of community-driven microscopy image data standards requires development on three interrelated fronts: (1) community-driven specifications for WHAT microscopy metadata information about an imaging experiment is essential for rigor, reproducibility and reuse and should therefore be captured in microscopy metadata (pink bubble); (2) shared rules for HOW the (ideally) automated capture, representation and storage of microscopy metadata should be implemented in practice (yellow bubble); and, last but not least, (3) next-generation file formats (NGFFs) WHERE the ever-increasing scale and complexity of image data and metadata would be contained for exchange (blue bubble)36,37. b, The 4DN-BINA-OME specifications for WHAT hardware specifications, image acquisition settings and quality-control metrics should be reported are articulated along three complexity axes: (1) guideline tiers: the three guideline tiers are employed to scale reporting requirements with experimental complexity; (2) model core vs. extensions: the use of the core of the OME Data Model vs. one or more of the 4DN-BINA extensions allows capturing different microscopy modalities; (3) metadata-requirement levels: the distinction between Must Use and Should Use metadata fields is used to define what information is needed for different reporting purposes (i.e., quality, reproducibility, sharing value). Depicted is the intersection between the three dimensions (OME Core + 4DN-BINA basic and calibration extensions ∩ Tier 2 ∩ All available fields, where ∩ signifies intersection) that would be appropriate to describe an experiment in which a wide-field microscope is used to capture the dynamics of viral particle trafficking within infected cells.

This manuscript is intended to launch a community-driven way forward to break the impasse. Specifically, it puts forth scalable specifications for light microscopy metadata developed jointly by the 4DN1,2 IWG and by the BINA QC-DM-WG3 to extend the OME Data Model8,9 (Figs. 1 and 2). To foster widespread adoption of the 4DN-BINA-OME5 framework (Fig. 1a, pink bubble), key components of this effort are (1) user-friendly and, when possible, automated metadata-collection software tools (OMERO.mde, MethodsJ2 and Micro-Meta App) that are presented in parallel manuscripts15,16,17,18,19,20 and are coupled with standards for metadata representation and storage (Fig. 1a, yellow bubble)31,32,33,34,35; and (2) sustainable roadmaps for the switch from proprietary image data formats to common, cloud-ready OME Next-Generation File Formats (NGFF, Fig. 1a, blue bubble)36,37. Importantly, all of these activities are expected to be carried out in the context of QUAREP-LiMi11,12,13 and involve key members of the community, including microscope users, custodians and manufacturers, imaging scientists, national and global bioimaging organizations, bioimage informaticians, standards organizations, funders and scientific publishers.

The proposed 4DN-BINA-OME Microscopy Metadata Specifications (Fig. 2) are articulated along three mutually independent axes (Fig. 1b).

1. Guideline tiers (metadata specification; Fig. 3): a system of adaptable tiers that spells out which specific subset of metadata information should be included depending on experimental context and intent, technical complexity and image analysis needs.

2. Core model and extensions (metadata extension; Fig. 4): a suite of extensions that expand the core of the OME Data Model8,9 to comprehensively capture state-of-the-art transmitted light and wide-field fluorescence microscopy (Basic extension) and confocal and advanced fluorescence modalities (Advanced and Confocal extension). Importantly, to improve the management of quality control, a novel data model for capturing instrument calibration procedures (Calibration and Performance extension) was developed in close collaboration with QUAREP-LiMi11,12,13.

3. Metadata-requirement levels (metadata inclusion; Fig. 5): inherent flexibility in the inclusion of metadata is built into the model so that specific pieces of information will be considered as “required” (essential for rigor and reproducibility; Fig. 1b, Must Use) or “recommended” (useful to improve image quality and to maximize scientific and sharing value; Fig. 1b, Should Use).

Although 4DN-BINA-OME is inherently adaptable (Fig. 1b), it provides all community stakeholders with clear and enforceable community-driven mandates for which information is required to ensure scientific rigor, experimental reproducibility and maximal scientific value.

The metadata challenge in microscopy: the great variability of data formats and metadata reporting practices

The introduction of digital light detectors and computers has drastically improved the objectivity of optical observations and changed light microscopy in three profound ways. First, it has led to digital image formation, signal processing and computational methods that enable the extraction of quantitative information from images and that have transformed light (and, in particular, fluorescence) microscopy into a key quantification tool for biomedical research. Second, it has allowed the increasingly accurate recording of progressively lower amounts of light signal, enabling the visualization and quantitative measurement of subcellular and single-molecule (SM) events and molecular interactions with high specificity and temporal resolution. Third, it has enabled imaging modalities, such as confocal laser-scanning microscopy (CLSM) and super-resolution (SR) imaging techniques, that allow high-resolution imaging of fixed and live samples in three dimensions.

Despite these advances and the use of ever-more-sophisticated and complex instruments, practices to faithfully and reproducibly record quantitative image data and metadata have not kept up, thus exacerbating existing challenges of quality control and reproducibility. The quality and scientific value of imaging data should be assessed not only based on the extent to which they can be used to answer the questions they were intended to address, but also on the extent to which they can be trusted and reused by others. It follows that in performing imaging experiments, scientific rigor is inextricably tied to image quality, the reproducibility of experimental results and the degree to which datasets can be integrated with other data and further analyzed to answer new questions.

Deriving valuable and rigorous information from images is completely dependent on the consistent recording and storage of information that captures the origin and subsequent processing of the data (i.e., “data provenance”)10, as well as metrics that quantitatively assess the quality of the microscope and of the images (i.e., “quality control”)6,7,11,12. A typical light microscopy experiment includes three (sometimes integrated) major steps centered around the production of image data (Fig. 2): (1) sample preparation, i.e., all sample preparative steps for imaging; (2) image data acquisition, i.e., light detection, image formation and recording; and (3) image analysis, i.e., the post-acquisition processing and quantification of images. Each procedure within these steps can add considerable variability to the final data. Thus, to document all possible sources of uncertainty, images need to be accompanied by image metadata6 describing any and all information that allows the actual image data (i.e., quantitative values associated with the image pixels; Fig. 2, pixel image data) and imaging results to be evaluated, interpreted, reproduced, found, cited, compared and reused as established by measurable data quality criteria (i.e., FAIR principles)6,9,34. Fundamentally, image metadata can be defined as metadata that document all phases of a typical microscopy experiment (Fig. 2) from (1) experimental treatment, sample preparation and labeling (Fig. 2a, experimental and sample metadata)38,39 to (2) microscope hardware specifications, image acquisition settings, microscope performance metrics and image data structure (Fig. 2a, microscopy metadata)6; to (3) details about any image analysis procedure employed to extract quantitative information from the images (Fig. 2a, analysis metadata)40,41,42.
As such, microscopy metadata constitute a subset of image metadata and, in turn, can be subdivided into two subcategories6,7: (1) microscopy data provenance metadata (MPM), describing the origin of the data, i.e., microscope hardware specifications, image acquisition settings and image structure (Fig. 2a, provenance); and (2) microscopy quality-control metadata (MQM), including calibration metrics that quantitatively assess the performance of the microscope (Fig. 2a, quality control). In addition to capturing MPM and MQM (Figs. 1 and 2), microscopy metadata standards should also address the following challenges:

1. Light microscopy utilizes a vast array of adaptable modalities, each requiring the reporting of different metadata as well as diverging quality-control approaches.

2. A microscope’s theoretical performance and working conditions are difficult to assess and are often unknown to the average user.

3. The relevant hardware and software metadata can be difficult to retrieve from available documentation.

4. The paucity of automation and intuitive software tools makes record-keeping unduly burdensome, forcing experimental biologists to choose between scientific rigor and productivity.

5. The variability of the file formats and the consequent need for raw data files to be converted into other formats before interpretation and comparison often yield a significant loss of metadata or, worse still, inadvertently compromise the data during the conversion process.

Fig. 2: Light microscopy metadata are essential for the assessment, interpretation, reproducibility, comparison and reuse of the results of microscopy experiments.

a, A schematic representation of a typical bioimaging experiment and the image metadata that must be collected to ensure the quality, reproducibility and scientific value of the resulting pixel image data (blue box). Specifically, imaging experiments and the associated metadata can be subdivided as follows: (1) sample preparation documented by experimental and sample metadata; (2) image data acquisition documented by microscopy metadata; and (3) image analysis documented by analysis metadata. In turn, microscopy metadata (pink boxes) can be subdivided into two categories as indicated: (1) provenance metadata include information that documents microscope hardware specifications, image acquisition settings and image structure and (2) quality-control metadata include metrics that quantitatively assess the performance of the microscope and the quality of image data and are obtained through the execution of specifically designed optical, intensity and mechanical calibration procedures. b, To capture and store microscopy metadata, the 4DN-BINA-OME Specifications presented here take advantage of the structure of the OME Data Model8,9, which serves as the de facto specification for the exchange of image data and metadata. Specifically, provenance metadata are stored in revised and extended versions of the <Instrument> and <Image> elements of the OME Data Model. Quality-control metadata, on the other hand, are stored using a newly designed Calibration and Performance extension of the same model.

Despite this apparent complexity, it is worth noting that the image acquisition step of an imaging experiment (Fig. 2) is eminently manageable and quantifiable, as long as the microscope and imaging system are properly documented, maintained and operated. Consequently, the development of community-sanctioned specifications for microscopy metadata that encompass MPM and MQM not only is essential for image data quality, reproducibility, and sharing value, but also should be easy to obtain, as described in more detail in an accompanying manuscript6.

Importance and potential pitfalls of standardization

The value of microscopy image data standards (Fig. 1) has been widely recognized, resulting in important efforts to establish best performance testing and instrument calibration practices43,44,45,46,47,48, to unify data-submission requirements from journals49,50,51,52 and to produce the exchange format between image data and metadata that forms the basis for this work8,9,30,36,37,53.

Nonetheless, existing efforts have not yet reached normative value, primarily because essential elements that are key components of this endeavor are still insufficient, including (1) coordinated community efforts that lead to an easy-to-understand consensus on what specifications should be followed to ensure scientific rigor for imaging experiments3,11,12,13; (2) software tools that make prescribed microscopy metadata models actionable by microscope manufacturers, custodians and users faced with the challenge of producing well-documented, high-quality, reproducible and reusable datasets, such as the ones being developed in complementary efforts (OMERO.mde, Micro-Meta App, MethodsJ2)15,16,17,18,19,20; and (3) available endpoints (i.e., deposition to image data repositories; data reuse pipelines) making the purpose and worth of good documentation clear to all members of the community53,54,55,56. As a result, it remains challenging for microscope manufacturers, custodians and users to determine which parameters are relevant to a given technique and imaging experiment, and best practice recommendations are often ignored because they are perceived as too expensive, complicated and cumbersome.

There is thus much to be gained from harmonizing the reporting standards in light microscopy. First, this would facilitate the documentation of any microscopy-based protocol, minimize error and quantify residual uncertainty associated with each step of the procedure (Fig. 2). This, in turn, would provide a wealth of valuable contextual information (collectively referred to as data provenance10) that would greatly increase the scientific and sharing value of the data. Such details would enable the reliable evaluation of scientific claims based on imaging data, facilitate comparisons within and between experiments, allow reproducibility and maximize the likelihood that data can be collated and analyzed by other scientists using current and future image processing and analysis methods. Furthermore, the increasing availability of public image repositories (for example, the Image Data Resource, IDR53; Electron Microscopy Public Image Archive, EMPIAR57; BioImage Archive54; the Cochin Image Database58; the NIH (National Institutes of Health) Cell Image Library59; the RIKEN Systems Science of Biological Dynamics database, SSBD60) will undoubtedly increase the need for community-wide documentation and quality-control standards that can adapt to new technologies. As a first step in this direction56, the Recommended Metadata for Biological Images (REMBI)55 guidelines were recently proposed; these would maximize the possibility of making bioimaging datasets available to other researchers in a timely manner, consistent with FAIR principles15,20,28 and thus amenable to reuse.

Despite offering innumerable advantages, standardization also has its pitfalls. First, in the absence of software tools, it can significantly increase the administrative burden associated with imaging experiments. Second, because it is impossible to know a priori the complexity and diversity inherent to experimental details and imaging modalities that have yet to be developed, a lack of flexibility can severely limit the type of data that can be stored. It follows that it is critical that any proposed set of sustainable community specifications meet strict expandability requirements. Because of its inherent extensibility and the solid plans for its modernization (see Box 1), the OME Data Model8,9 provides a robust foundation for microscopy metadata (Fig. 2b) that can be extended by introducing information that is not yet covered (including experimental specific metadata, modality-specific metadata, quality-control metadata and analysis-specific metadata). As these extensions20,33,34,35,40,42 become more commonly used, they can be incorporated into the core model through community announcements and related vetting processes to ensure that they meet expanding community needs.

A three-dimensional matrix of 4DN-BINA-OME Microscopy Metadata Specifications

Given that a one-size-fits-all solution for microscopy metadata requirements is clearly not tenable, here we propose the 4DN-BINA-OME framework (available as described in Table 1), in which microscopy documentation and quality-control requirements are organized along three orthogonal axes that are largely independent from each other (Fig. 1b). The first axis is based on the observation that different types of experiments have different reporting and quality-control requirements based on technical complexity, experimental design, and image analysis needs. Hence, requirements along this axis are subdivided into tiers depending on the three criteria listed above (Fig. 1b, guideline tiers; Fig. 3, Table 2 and Supplementary Table 1). The second axis starts with the OME Data Model8,9 and extends it with additional metadata components that are introduced on the basis of microscopic modality (for example, epifluorescence vs. confocal microscopy) and accommodate expansion as new technologies are developed that are covered neither by the core nor by the currently proposed extensions (Fig. 1b, OME core vs. extensions; Fig. 4). Finally, the third axis grades documentation requirements based on whether each piece of information is essential for rigor and reproducibility (Must Use) or recommended to improve image quality and for maximizing scientific and sharing value (Should Use; Fig. 1b, metadata-requirement levels; Fig. 5). The existence of these three axes will allow institutions, funding agencies, consortia and scientific publishers to define best practices for light microscopy experiment documentation while concomitantly allowing individual scientists to find an appropriate position on the guideline matrix that both matches their needs and remains compatible with community-mandated guidelines. As an example, Table 3 lists where some representative experiments would fall within the microscopy metadata guideline matrix (Fig. 1b).
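As a minimal sketch of how a position on this three-axis matrix could translate into a concrete list of fields to report, the Python snippet below filters a small field catalogue by tier, extension and requirement level. The element names are borrowed from those mentioned in this manuscript, but the attribute names and every tier and requirement-level assignment are hypothetical and serve only as illustration; they are not the published specifications. The sketch also assumes that tiers are cumulative, in line with the idea that more elaborate experiments require metadata on top of those needed for simpler ones.

```python
from dataclasses import dataclass
from enum import Enum


class Extension(Enum):
    CORE = "OME Core"
    BASIC = "Basic"
    ADVANCED_CONFOCAL = "Advanced and Confocal"
    CALIBRATION = "Calibration and Performance"


class Level(Enum):
    MUST = "Must Use"
    SHOULD = "Should Use"


@dataclass(frozen=True)
class FieldSpec:
    name: str
    extension: Extension
    tier: int      # lowest tier (1-3) at which the field is reported
    level: Level


# Hypothetical assignments, for illustration only.
FIELDS = [
    FieldSpec("Objective.Magnification", Extension.CORE, 1, Level.MUST),
    FieldSpec("Objective.NumericalAperture", Extension.CORE, 1, Level.MUST),
    FieldSpec("Objective.WorkingDistance", Extension.CORE, 2, Level.SHOULD),
    FieldSpec("CameraSettings.ExposureTime", Extension.BASIC, 2, Level.MUST),
    FieldSpec("PinHoleSettings.Diameter", Extension.ADVANCED_CONFOCAL, 2, Level.MUST),
    FieldSpec("OpticalCalibration.Date", Extension.CALIBRATION, 2, Level.MUST),
    FieldSpec("OpticalCalibration.Resolution", Extension.CALIBRATION, 3, Level.MUST),
]


def select_fields(tier, extensions, include_should=True):
    """Return the names of the fields at the intersection of the three axes."""
    return [
        f.name
        for f in FIELDS
        if f.tier <= tier                                   # axis 1: guideline tier
        and f.extension in extensions                       # axis 2: core vs. extensions
        and (include_should or f.level is Level.MUST)       # axis 3: requirement level
    ]


# The wide-field example of Fig. 1b: OME Core plus the Basic and Calibration
# extensions, Tier 2, all available fields.
wide_field = {Extension.CORE, Extension.BASIC, Extension.CALIBRATION}
selected = select_fields(2, wide_field, include_should=True)
print(selected)
```

Because each axis is an independent filter, an institution or publisher could tighten one axis (for example, accept only Must Use fields) without touching the other two.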

It should be noted that the 4DN-BINA-OME and REMBI55 metadata frameworks were developed in parallel and were deliberately designed to map directly onto each other. Specifically, with the proviso that REMBI also defines metadata for electron microscopy and correlative imaging, the following correspondences exist between REMBI and 4DN-BINA-OME for light microscopy:

1. The REMBI “Instrument attributes” element maps to the <Instrument> core element of 4DN-BINA-OME, which captures Microscope Hardware Specifications metadata.

2. The REMBI “Image acquisition parameters” element maps to the <Image> core element of 4DN-BINA-OME, which captures Image Acquisition Settings metadata.

Table 1 Public availability of the NBO Microscopy Metadata Specifications

Fig. 3: Scaling light microscopy metadata guidelines with experimental, technical and analytical intent and complexity minimizes the recordkeeping burden while maximizing the value, quality and reproducibility of image data.

Shown is a schematic representation of the graded system for metadata specifications proposed by 4DN and BINA to tailor reporting requirements to experimental complexity. In this system, microscope hardware and imaging experiments are classified based on the following criteria: (1) experiment and image complexity, (2) microscope technology and imaging modality and (3) results and analysis requirements. For each criterion, the schema provides graphical illustrations of increasing complexity along the three-tier axis that can be used as an initial guide for microscope users in mapping reporting requirements to their experimental needs.

Fig. 4: The 4DN-BINA Light Microscopy Metadata Specifications extend the core of the OME Data Model.

Portrayed are Venn diagrams containing a linked-open-data (LOD; Fig. 6b) representation of the core vs. extension relation between metadata model elements that belong to the core of the OME Data Model (OME) namespace (blue ovals) and those that belong to the three proposed extensions specified by the 4DN-BINA-OME (NBO) namespace (maroon, gray and green ovals). Specifically, the schema illustrates the relationship between the <Instrument> (a) and <Image> (b) elements (OME: Instrument, OME: Image; red-bordered blue ovals) and their subelements belonging to the core of the OME Data Model (light blue set containing blue ovals; for example, OME: Filter, OME: Channel, etc.), with subelements specified by the 4DN-BINA-OME extensions Calibration and Performance (light red set containing maroon ovals; for example, NBOC: LightSensor, NBOC: OpticalCalibration, etc.), Basic (light gray set containing dark gray ovals; for example, NBOB: OpticalAperture, NBOB: CameraSettings, etc.), and Advanced and Confocal (light green set containing green ovals; for example, NBOA: SpinningDisk, NBOA: PinHoleSettings, etc.). The schema is not intended to be comprehensive and includes only a small subset of the elements that comprise the model. AOBS, acousto-optical beam splitter; AOTF, acousto-optical tunable filter; LCTF, liquid crystal tunable filter; TIRF, total internal reflection fluorescence.

Fig. 5: The third axis of the 4DN-BINA-OME Microscopy Metadata Specifications adds further flexibility to minimize the imaging experiment documentation burden.

Depicted is an example Venn diagram representing attributes that are required to document the characteristics of an objective lens and are stored in the <Objective> element of the 4DN-BINA-OME Core and the Basic extension. In the schema, objective attributes are color-coded based on their tier level and are subdivided into requirement-level categories based on the following criteria: (1) required (MUST) fields are necessary to validate claims and reproducibility; (2) recommended (SHOULD) fields are prescribed to ensure maximal image quality and sharing value. Color-coding is consistent with that used throughout the manuscript: green, Tier 1; orange, Tier 2; dark blue, Tier 3.

Table 2 Tiers for light microscopy metadata reporting as proposed by the IWG of the 4DN initiative and by the QC-DM-WG of BINA

Because of this deliberate direct mapping, microscopy metadata specified by 4DN-BINA-OME intrinsically meets and exceeds the requirements imposed by REMBI for light microscopy55. Hence, the adoption of 4DN-BINA-OME Specifications (especially through the use of the complementary software tools being simultaneously presented in related manuscripts)15,16,17,18,19,20 would greatly facilitate the work of microscopists who want to deposit imaging data on BioImage Archive54.
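For tool developers, the two REMBI correspondences stated in this section could be encoded as a simple lookup table. The sketch below is only an illustration: the dictionary and the helper function are hypothetical names, and only the two light-microscopy correspondences given in the text are included.

```python
# Correspondences stated in the text (light microscopy only).
REMBI_TO_NBO = {
    "Instrument attributes": "Instrument",        # Microscope Hardware Specifications
    "Image acquisition parameters": "Image",      # Image Acquisition Settings
}


def nbo_element_for(rembi_element):
    """Return the 4DN-BINA-OME core element name, or None if no
    correspondence is listed for the given REMBI element."""
    return REMBI_TO_NBO.get(rembi_element)


print(nbo_element_for("Instrument attributes"))
```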

The first axis: a tier-based system of guidelines for light microscopy metadata

To achieve rigor and reproducibility, increasingly elaborate imaging experiments require additional metadata on top of those required for more basic experiments. On this account, a graded system for metadata requirements not only is appropriate but also minimizes the burden of collecting metadata for each experiment while maximizing the opportunities for rigor, reproducibility, evaluation, analysis and comparison. We envision a flexible system in which different imaging communities (that is, individual research institutions, individual fields of knowledge or research consortia) would define their own sets of criteria whereby microscope hardware and imaging experiments are classified in tiers based on experimental and image complexity, microscope technology and imaging modality, and analytical requirements. Hence the tiered system of guidelines presented here (Fig. 3, Table 2, Supplementary Table 1 and Supplementary Information, Supplementary 4DN-BINA-OME Tier-system description - Axis 1)61 should be considered as an example of how different imaging experiment types could be placed on a complexity scale to facilitate the collection of the most appropriate minimum set of metadata required for reproducibility and comparison of each category. We expect that this system will evolve organically to incorporate new imaging modalities. Active international initiatives such as QUAREP-LiMi11,12,13 should help to ensure that new metadata specifications are agreed upon by the community and consistent with existing standards.

Table 3 Examples of utilization of the three-axis matrix of the 4DN-BINA-OME Microscopy Metadata Specifications

A robust, maximally useful and efficient metadata standard would be tailored around the different reporting requirements of experiments of increasing complexity. We suggest here a system composed of one descriptive tier (Tier 1) and two analytical tiers (Tiers 2 and 3; Fig. 3, Table 2, Supplementary Table 1 and Supplementary Information, Supplementary 4DN-BINA-OME Tier-system description - Axis 1)61, in which imaging instrumentation and datasets are classified based on the following sets of criteria:

1. Are results amenable to visual interpretation or is advanced image analysis (for example, subpixel SM localization microscopy, SMLM) required for the full understanding of results?

2. Are biological samples fixed or alive during acquisition?

3. Do any parts of the quantitative microscopy pipeline (microscope instrument, acquisition modality and image analysis) rely on novel, rather than fully established, technology?

4. Are the data provenance and quality-control metadata tracked, documented and reported by hardware manufacturers or instrument developers?

Consistent with minimum information principles, the system represents a minimal set of metadata required for each tier, covering only the information relevant for the interpretation of the specific imaging experiment (although more comprehensive information is always allowed and encouraged). As an example, the proposed specifications encompass information about the sample that directly affects the imaging conditions (for example, labeling method, mounting medium). However, because of the complexity of fully describing experimental and sample preparation procedures, such endeavors pertain more directly to the communities involved in the different research areas that utilize microscopy as an investigation method (cell biology, developmental biology, etc.) and are clearly beyond the scope of this effort. Although the initial impetus for developing such specifications will have to originate within individual research fields, coordination across domains will be necessary to develop consensus around overlapping areas and avoid splintering off in discordant directions. A detailed description of the 4DN-BINA-OME tier system61 is available in Supplementary Table 1 and Supplementary Information, Supplementary 4DN-BINA-OME Tier-system description - Axis 1.

The second axis: a suite of three 4DN-BINA-sponsored community-driven OME extensions

In its simplest form, metadata can be easily represented as lists of key-value pairs, in which the first term is a descriptive term for a specific attribute and the second term is the value of the attribute, including units for numerical values. However, lists of key-value pairs are often not sufficient to define rich metadata guidelines as they do not make it possible to capture the often complex relationships between different real-world components and situations. A better approach is the development of abstract models for the data that represent the scenario to be described. Ideally, such a data model would account for the components of the system, the attributes that need to be recorded for each component to be fully documented, and the relationship between components (Fig. 6a). A useful formalism for developing, describing and viewing an appropriate data model is the entity–relationship (ER) diagram62, which must subsequently be translated into formalized schemas and file formats (Fig. 6b) to facilitate the implementation of metadata capture and management tools.
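The difference between a flat list of key-value pairs and a data model that also captures relationships can be illustrated with a short Python sketch. The class and attribute names below are illustrative and are not the 4DN-BINA-OME schema; the objective values mirror the example used for Fig. 6a, whereas the laser wavelength is a hypothetical value.

```python
from dataclasses import dataclass, field
from typing import List

# Flat key-value pairs: the values are recorded, but nothing states which
# component each value describes or how the components relate to each other.
flat_metadata = {
    "Manufacturer": "Nikon",
    "Magnification": 100.0,
    "NumericalAperture": 1.4,
    "LightSourceType": "Laser",
}


@dataclass
class LightSource:
    """Generalized element family."""
    manufacturer: str = ""


@dataclass
class Laser(LightSource):
    """IS-A relationship: a Laser is a specific kind of LightSource."""
    wavelength_nm: float = 0.0


@dataclass
class Objective:
    manufacturer: str
    magnification: float
    numerical_aperture: float


@dataclass
class Instrument:
    """HAS-A relationships: an Instrument has light sources and objectives."""
    light_sources: List[LightSource] = field(default_factory=list)
    objectives: List[Objective] = field(default_factory=list)


instrument = Instrument(
    light_sources=[Laser(wavelength_nm=488.0)],   # hypothetical wavelength
    objectives=[Objective("Nikon", 100.0, 1.4)],
)

for obj in instrument.objectives:
    print(f"Objective by {obj.manufacturer}: "
          f"{obj.magnification:g}x, NA {obj.numerical_aperture}")
```

In the structured version, the manufacturer unambiguously belongs to the objective, and the type system itself records that a laser is a kind of light source; neither fact can be recovered from the flat dictionary.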

Fig. 6: A data model is a schematic representation of reality that can be used to organize metadata and produce tools.

a, Building visually compelling conceptual metadata models captures not only individual attributes and their values, but also the often-complex relationships between different real-world entities. Presented is a simplified entity–relationship (ER)62 depiction of the 4DN-BINA-OME data model that portrays the hardware configuration of a microscope. In this formal representation, (1) solid-line boxes are used to symbolize individual hardware components (for example, <Light Source>, <Objective>, etc.); dashed-line boxes denote generalized element-families to which specific ‘child’ elements belong (for example, a Laser belongs to the Light Source family). (2) Lists of attributes (key-value pairs, enclosed in a beige box) represent metadata that need to be recorded about each hardware component (for example, Magnification, Numerical Aperture, etc.). (3) Lines are used to model relationships between components. Specifically, blue lines represent ‘HAS-A’ relationships (for example, “An Instrument HAS-A Light Source”); black arrows represent ‘IS-A’ relationships connecting generalized to specific concepts (for example, “A Laser IS-A Light Source”). Based on the rules indicated above, the depicted schema can be read to signify: “This instrument has a laser, which is a specific type of light source, and has an objective built by Nikon, with 100× magnification and 1.4 numerical aperture.” b, Depiction of the process, starting from human-readable statements describing the data (Real life, top panel), that is often used to produce actionable code (Schema, bottom panel) used to build essential metadata capture and management tools. Statements are first rendered into graphical illustrations that provide a bird’s-eye view of the entire system, such as the linked-open-data (LOD) graph depicted here (Diagram, second panel from the top). Subsequently, diagrams are parsed to produce structured statements (Formal statement, second panel from the bottom) using one or more available methods (for example, English syntax, Key-value pair, Entity Relationship, XML Schema Definition, JSON-LD/RDF/OWL, etc.). Finally, statements are encoded using formal schema languages. In the example depicted (bottom panel), JSON-Linked Data (JSON-LD; https://www.w3.org/TR/json-ld11/) is used to serialize Resource Description Framework (RDF; https://www.w3.org/RDF/) triples to build extensible LOD information graphs.
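The last step described in the caption, encoding statements as JSON-LD that serializes RDF triples, can be illustrated with a small Python sketch. The vocabulary namespace `https://example.org/nbo#`, the `urn:example:` identifiers and the property names are all hypothetical placeholders, not the published NBO vocabulary, and the `to_triples` helper is a deliberately simplified flattener for this one document rather than a conforming JSON-LD processor.

```python
import json

VOCAB = "https://example.org/nbo#"  # hypothetical namespace, for illustration only

# The instrument statement written as a nested JSON-LD document.
document = {
    "@context": {"@vocab": VOCAB},
    "@id": "urn:example:instrument-1",
    "@type": "Instrument",
    "hasLightSource": {"@id": "urn:example:laser-1", "@type": "Laser"},
    "hasObjective": {
        "@id": "urn:example:objective-1",
        "@type": "Objective",
        "manufacturer": "Nikon",
        "magnification": 100,
        "numericalAperture": 1.4,
    },
}


def to_triples(node, vocab):
    """Flatten a nested node into (subject, predicate, object) triples.

    Handles only the subset of JSON-LD used above: @id, @type,
    literal values and nested nodes that carry their own @id.
    """
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key in ("@id", "@context"):
            continue
        if key == "@type":
            triples.append((subject, "rdf:type", vocab + value))
        elif isinstance(value, dict):
            # A nested node becomes a link plus that node's own triples.
            triples.append((subject, vocab + key, value["@id"]))
            triples.extend(to_triples(value, vocab))
        else:
            triples.append((subject, vocab + key, value))
    return triples


triples = to_triples(document, VOCAB)
print(json.dumps(document, indent=2))
for triple in triples:
    print(triple)
```

Each triple is one edge of the information graph; because subjects and objects are identifiers rather than positions in a file, further statements about the same objective or laser can be added later without restructuring the existing ones.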

Because of its status as the only existing exchange format for imaging experiments, the robustness of its design and its solid path forward toward modernization (Box 1; Fig. 2b; details in Supplementary Information, Supplementary OME Model Description), the OME Data Model (i.e., OME Core) represents the ideal starting point for the suite of 4DN-BINA extensions presented here (Fig. 4 and Supplementary Figs. 1–4). As such, the 4DN-BINA-OME specifications proposal consists of three extensions of the OME Core8,9, each incorporating the concept of graded documentation requirements based on a tiered system of guidelines (Fig. 3, Table 2, Supplementary Table 1 and Supplementary 4DN-BINA-OME Tier-system description - Axis 1). A first step toward this goal, the 4DN-BINA-OME Microscopy Metadata Specifications14,63,64, extend the core OME elements <Instrument> and <Image> (Figs. 2b and 4 and Supplementary Figs. 1–4) to reflect the technological advances and the quality control requirements associated with state-of-the-art transmitted light, wide-field and confocal-fluorescence microscopy. A detailed description of the system of three proposed 4DN-BINA-OME extensions is available in the Supplementary Information, Supplementary 4DN-BINA-OME Extension system description - Axis 2. In summary:

  1. The Basic extension is designed to better capture the technical complexity of transmitted light microscopy and wide-field fluorescence, including subpixel single-particle localization and single-particle tracking experiments (Fig. 4, blue and gray elements; Supplementary Figs. 1 and 2).

  2. The Advanced and Confocal extension is designed to better capture experiments requiring tunable optics and confocal microscopy (Fig. 4, green elements; Supplementary Figs. 1 and 3).

  3. The Calibration and Performance extension introduces specifications for the capture of metrics required for microscope calibration and quantitative instrument performance assessment (Fig. 4, maroon elements; Supplementary Figs. 1 and 4).

Although it would be impracticable for the current version of the specifications to meet all emerging community needs, the proposed structure provides a flexible framework to easily accommodate future extensions that will need to be developed in close collaboration with the community to capture sources of image data that our model does not yet fully define (such as light-sheet and Airy scan confocal microscopy).

To facilitate understanding of the 4DN-BINA-OME by all relevant members of the community regardless of their information science expertise, while at the same time ensuring machine readability, formal representations of the 4DN-BINA-OME extensions are maintained on GitHub14 in three formats (Table 1): (1) a set of graphical ER schemas to facilitate an overall understanding of the model structure64; (2) an Excel spreadsheet to express the details of the model in a human-readable form64; and finally (3) an XML Schema Definition (XSD) to represent the model schema in a machine-readable manner63.

The third axis: metadata requirement levels

Along the third axis (Fig. 5), individual metadata fields are classified based on requirement level as described by Internet Engineering Task Force Request for Comment (RFC) document 2119, produced by the Network Working Group65. The keyword MUST, or the terms REQUIRED or SHALL, mean that the field is an absolute requirement to validate experimental claims and ensure reproducibility. The keyword SHOULD, or the adjective RECOMMENDED, means that although valid reasons may exist in particular circumstances to omit a given field, recording it is highly recommended to maximize image quality, scientific value and FAIRness28. Two examples of the use of the third dimension to add flexibility to the proposed 4DN-BINA-OME Microscopy Metadata Specifications are presented below.

  1. Example 1: OME Core and 4DN-BINA Basic extension element <Objective> (Fig. 5)

     Whereas the Manufacturer, Model, Magnification and Numerical Aperture (Lens NA) of an objective are required to make it possible to interpret microscopy results and for reproducibility, other attributes such as a hardware component’s Lot Number, a Lens’s Back Focal Length and the Calibrated Magnification of an Objective are recommended to maximize image quality and scientific value, but they are not required because they are not essential for reproducing the experiment.

  2. Example 2: 4DN-BINA Calibration and Performance extension element <MultiColorBeads>

     When using multicolored beads to prepare a colored-beads slide to use for the optical calibration of a microscope, the Manufacturer, Catalog Number and Concentration of the bead preparation alongside the Diameter of the beads are essential for the interpretation of the calibration results and for reproducibility. However, the beads’ Type and Material may be omitted because it can be argued that although that information improves the completeness of the data, it is not absolutely required for the correct interpretation of the results of the optical calibration procedure in which the beads are used.
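The way requirement levels could be applied by a metadata collection tool can be sketched as follows, using the <Objective> fields from Example 1. This is an illustrative check written under stated assumptions, not the validation logic of any of the tools discussed in this manuscript: the dictionary keys are hypothetical spellings of the attribute names, and the record values (including the placeholder model name) are invented for the example.

```python
from enum import Enum


class Level(Enum):
    MUST = "MUST"      # absolute requirement
    SHOULD = "SHOULD"  # recommended, may be omitted for valid reasons


# Requirement levels for <Objective>, following Example 1.
# Key spellings are assumptions made for this sketch.
OBJECTIVE_SPEC = {
    "Manufacturer": Level.MUST,
    "Model": Level.MUST,
    "Magnification": Level.MUST,
    "LensNA": Level.MUST,
    "LotNumber": Level.SHOULD,
    "BackFocalLength": Level.SHOULD,
    "CalibratedMagnification": Level.SHOULD,
}


def missing_fields(record, spec):
    """Group absent or empty fields by their requirement level."""
    missing = {Level.MUST: [], Level.SHOULD: []}
    for name, level in spec.items():
        if record.get(name) in (None, ""):
            missing[level].append(name)
    return missing


def is_compliant(record, spec):
    """A record is compliant when no MUST field is missing."""
    return not missing_fields(record, spec)[Level.MUST]


# Illustrative record: all MUST fields present, no SHOULD fields.
objective = {
    "Manufacturer": "Nikon",
    "Model": "example-model",  # placeholder value
    "Magnification": 100,
    "LensNA": 1.4,
}

report = missing_fields(objective, OBJECTIVE_SPEC)
print("compliant:", is_compliant(objective, OBJECTIVE_SPEC))
print("recommended but missing:", report[Level.SHOULD])
```

Keeping the two levels separate lets a tool reject a record only when a MUST field is absent, while still reporting the SHOULD fields that would improve the completeness of the record.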

Model implementation: recommendations for Materials and Methods descriptions

A recent exploration of the quality of published Methods sections in scientific articles containing images obtained with advanced microscopes found that the quality of reporting was poor, with some articles containing no information about how images were obtained, and many lacking important basic details22. Yet there is ample evidence that the publication of full details about how each image was obtained is vital for rigor, reproducibility and maximal scientific and sharing value23,24,25,50,51,52. In this context, the 4DN-BINA-OME Microscopy Metadata Specifications presented are intended to provide a major contribution toward the development of community-driven criteria for which information should be included in the Methods sections of scientific publications.

As a first step, in close agreement with parallel proposals50,52, we propose that microscopy metadata appropriate for Tier 1 (both Must and Should fields) should always be included in the Materials and Methods section of any journal publication to meet minimal rigor and reproducibility criteria22. The generalized and automated availability of Tier 1 metadata, such as provided by the MethodsJ2 app17,19, could save considerable effort both for authors, who would not need to search for information scattered across different data files, hardware setups and lab notebooks in preparation for publication, and for readers, who would not need to search the various sections of publications for information that may or may not have been included.

Model implementation: facilitated metadata collection

The importance of rich metadata to ensuring the quality, reproducibility, and scientific and sharing value of image data cannot be overstated. However, collecting rich sets of microscopy metadata is time consuming and, in the absence of active participation from hardware manufacturers, imposes an unfair burden on experimental scientists and is therefore difficult to enforce. Appropriate community-validated software tools and data management practices are essential to streamline and automate the documentation of microscopy experiments. In this context, in parallel with this proposal for microscopy metadata guidelines, a suite of three complementary and interoperable software tools is being developed and is presented in related manuscripts. (1) OMERO.mde15,20 focuses on facilitating the consistent handling of image metadata ahead of data publication and deposition based on shared community microscopy metadata specifications and according to the FAIR principles. In addition, OMERO.mde promotes the early development of image metadata extension specifications to allow their testing and validation before incorporation in community-accepted standards. (2) The Micro-Meta App16,18 focuses on an easy-to-use, graphical user interface (GUI)-based platform that interactively guides users through the process of building tier-based records of microscope hardware, accessories and image acquisition settings containing all relevant microscopy metadata as sanctioned by community specifications such as the ones described here. Because of its graphical nature, the Micro-Meta App is particularly suited to enabling imaging scientists to enter all microscope metadata and use the tool to teach trainees about microscopy and the importance of microscopy metadata and to train microscope users in imaging facilities. (3) Finally, MethodsJ217,19 focuses on automating the process of writing Methods and Acknowledgements sections that are compliant with microscopy metadata guidelines for scientific publications involving microscopy experiments. MethodsJ2 is designed to automatically import microscopy metadata both from image data files, using Bio-Formats, and from the Micro-Meta App16,18.

Model implementation: information required for basic image interpretability

To ensure the basic interpretability of image data acquired before the adoption of community-sanctioned guidelines, any data that might be shared or published should, at the very least, contain the required metadata fields stipulated by the intersection between Tier 1 and the core of the OME Data Model. Thus, Tier 1/core sanctions the baseline metadata requirements for any light microscopy experiment to be interpretable, used and shared for scientific purposes54,55. Specifically, this includes minimal microscope hardware specifications (i.e., microscope, light source and objective manufacturer information and essential description) and essential information about the image structure (i.e., number of planes, channels and timepoints, pixel size, fluorophore name, emission and excitation wavelength, etc.).
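A baseline check of this kind can be sketched in Python against an OME-XML fragment. The element and attribute names follow the OME 2016-06 schema, but the choice of which attributes stand for the Tier 1/core baseline is an assumption made for this sketch rather than the normative list, and the sample values in the fragment are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

# Assumed mapping of the baseline described above onto OME-XML attributes.
BASELINE = {
    "ome:Image/ome:Pixels": [
        "SizeZ", "SizeC", "SizeT", "PhysicalSizeX", "PhysicalSizeY",
    ],
    "ome:Image/ome:Pixels/ome:Channel": [
        "Fluor", "ExcitationWavelength", "EmissionWavelength",
    ],
    "ome:Instrument/ome:Objective": [
        "Manufacturer", "NominalMagnification", "LensNA",
    ],
}

# Illustrative fragment; PhysicalSizeY and EmissionWavelength are left out on purpose.
OME_XML = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Instrument ID="Instrument:0">
    <Objective ID="Objective:0" Manufacturer="Nikon"
               NominalMagnification="100" LensNA="1.4"/>
  </Instrument>
  <Image ID="Image:0">
    <Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="1" SizeC="1" SizeT="1"
            PhysicalSizeX="0.065">
      <Channel ID="Channel:0:0" Fluor="GFP" ExcitationWavelength="488"/>
    </Pixels>
  </Image>
</OME>"""


def find_missing(xml_text):
    """List baseline attributes that are absent from the OME-XML text."""
    root = ET.fromstring(xml_text)
    missing = []
    for path, attributes in BASELINE.items():
        elements = root.findall(path, NS)
        if not elements:
            missing.append(f"{path} (element absent)")
            continue
        for element in elements:
            for attribute in attributes:
                if attribute not in element.attrib:
                    missing.append(f"{element.get('ID', path)}: {attribute}")
    return missing


missing = find_missing(OME_XML)
for entry in missing:
    print("missing:", entry)
```

Run on legacy data, a report of this kind indicates which pieces of baseline information would have to be recovered from lab notebooks or facility records before the images are shared.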

Conclusions

Light microscopy images need to be accompanied by thorough documentation of the microscope hardware and imaging settings used in creating them to ensure correct interpretation of the results. A significant challenge to the reproducibility of microscopy results and their integration with other data types, such as chromatin folding maps generated by the 4DN consortium1,2,66, lies in the lack of standardized reporting guidelines for microscopy experiments, as well as instrument performance and calibration benchmarks22,23,24,51. Despite a growing consensus that such standards for light microscopy are desirable, previous efforts to develop shared microscopy data models and application programming interfaces8,9,30 have not yet succeeded in establishing universal sets of norms. In this manuscript, a framework to extend the OME Data Model is put forth to help address this challenge. In addition to aligning the OME Data Model to current technological developments, the specifications advanced here focus on maximizing usability via the introduction of a tiered system of documentation requirements; on an expandable suite of model extensions, which includes the first available data model for light-microscopy quality-control metadata; and on the flexible use of required and recommended fields.

Microscopy is not the only field in which recent technological advances have resulted in increasingly rich datasets. Recent examples are genomic DNA and transcriptomics RNA sequencing, which are, in fact, much younger fields than microscopy. Although protocols varied substantially in their early days (the original images from the sequencer were kept with the determined sequence), it took only about a decade to establish metadata requirements. One factor that helped establish such metadata criteria was the NIH Encyclopedia of DNA Elements (ENCODE) consortium67. The development of standard operating procedures (SOPs) and shared benchmarks (i.e., gold standards) within this group was pivotal for the establishment of agreed-upon standards for practical day-to-day use. In the interest of scientific progress and making data FAIR, data and metadata standards should not be dictated by individual laboratories or microscope manufacturers. Instead, they should emerge organically from discussions involving all members of the community who can benefit from standardization and be subjected to evaluation before adoption.

In this spirit, the initial draft Microscopy Metadata Specifications put forth by the 4DN1,2 IWG were evaluated and revised by the BINA QC-DM-WG3, resulting in the current proposal. Because it is inherently impossible to predict all future changes that might occur in the field of light microscopy, more work is needed to ensure that the 4DN-BINA (as well as future) extensions of the OME Data Model for bioimaging metadata proposed here continue to evolve through regular exchange of information and views across the community, so that rigor and reproducibility of image data are ensured now and in the future. This is required to capture any future technical development in a manner consistent with current specifications while supporting FAIR data principles15,20,28. This is particularly important in the face of the establishment of a growing number of public image data resources56 such as the European IDR53, EMPIAR57, Movincell68 and BioImage Archive54; the Japanese SSBD hosted by RIKEN60; and, in the US, the Allen Cell Explorer69, the Human Cell Atlas70 and the NIH-funded Cell Image Library59, Human BioMolecular Atlas Program (HuBMAP)71 and BRAIN initiative imaging resources72. These resources offer the opportunity to emulate, for light microscopy, the successful path that has led to community standards in the field of genomics39,73,74,75.

Because of the community nature of this effort, the 4DN-BINA-OME specifications must evolve first and foremost in alliance with the QUAREP-LiMi initiative11,12,13 to ensure that all participating imaging community stakeholders, importantly including microscope and software tool manufacturers (who are ultimately responsible for providing the information to be recorded in microscopy metadata), are involved from the ground up and provide timely feedback. In addition, the further development of the 4DN-BINA-OME Microscopy Metadata Specifications is being coordinated with other parallel initiatives, including the following.

  1. The development of strategies and pipelines to integrate images and their metadata with -omics data from the same experiment, such as is underway as part of 4DN1,2.

  2. The OME community development of general criteria and procedures to capture and store metadata in OME-NGFF (Box 1). The OME-NGFF effort36,37 is implementing storage approaches to hold the binary pixel data and the metadata described herein in standardized, shareable, long-lived, efficient and performant containers (for example, files).

  3. The EMBL-EBI development of the REMBI recommendations for metadata to be included with imaging datasets deposited to BioImage Archive54,55.

  4. The development of the International Organization for Standardization (ISO) 23494-1 standard that will include the 4DN-BINA-OME (NBO namespace) Microscopy Metadata Specifications as part of a provenance information model for biological material and data31,32.

  5. The development of online educational material, workshops and in-person courses in the context of BINA and in collaboration with Global BioImaging76 and other community partners77,78.

The specific purpose of this multipronged community effort will be to (1) educate microscope manufacturers, custodians and users about the importance of metadata standards and documentation to ensure image data quality, reproducibility and reuse value; (2) increase awareness about the 4DN-BINA-OME Microscopy Metadata Specifications proposed here and the complementary software tools for implementation developed in parallel efforts15,16,17,18,19,20; and (3) engage all major stakeholders (including those in the commercial, government and academic worlds) in our effort toward community-driven metadata standards for light microscopy. Initially, this mechanism will be used to generate a wider consensus around the current framework and lead toward the development of true community standards. A similar approach will be employed to engage representatives of different domains to generate microscopy metadata extensions and tier systems that best suit their research areas and avoid splintering off in multiple incompatible directions. As an example, more extensions will have to be defined to capture sources of image data that our model does not fully define, both in experimental (for example, light-sheet and Airy scan confocal microscopy) and synthetic image frameworks (for example, predictive multichannel image synthesis and super-resolution-level image restoration).

In conclusion, we are confident that because of its strong roots in the community, and because it is closely linked with the parallel development of easy-to-use interactive tools to facilitate metadata collection15,16,17,18,19,20, the flexible model framework presented here will provide a significant step forward toward the establishment of robust and future-proof metadata standards for light microscopy. With its key partnerships and increasing support from institutions and funding agencies, this work will continue to expand and help increase rigor and reproducibility in imaging data, rewarding everyone involved with improved trust in published results.