To the Editor—Advances in bioimaging over the past 20 years have been accompanied by developments in computational approaches for image reconstruction, analysis, classification and interpretation. Bioimaging has a broad range of applications addressing a variety of biological models at diverse scales of life; thus, descriptions of novel computational approaches are often focused on target case studies. Consequently, the conception and the development of a unified solution, able to tackle any scenario in biological imaging, are major challenges. Several types of architecture and tools have been proposed to surmount these technological difficulties. Although moving in the right direction, the existing software platforms (such as Fiji1, Icy2 and CellProfiler3), developed in various programming languages, are not all interoperable. Additional code development efforts are needed to gather various heterogeneous image-processing components in ad hoc workflows.

At the same time, although data storage on dedicated database servers, such as OMERO4 and others (for example, see https://imagerie.cochin.inserm.fr/sis4web/login.php), is becoming widespread, interaction with processing and analysis tools could benefit from further improvement. A more integrated approach to data organization, visualization and analysis is required to avoid the tedious task of manual management or scripting. In addition, managing often massive datasets requires dedicated expertise in computer science to scale up storage and computational resources. Now, with the emergence of artificial intelligence, such as deep learning, in bioimaging (for example, ImJoy5), the automation of processing tasks and the implementation of analysis pipelines that include image visualization (such as napari6), it is necessary to consider all stages of the data life cycle and new human–machine interactions. It is worth noting that data handling must now meet high quality criteria that will ensure identification, accessibility and interoperability of data with their processing, storage and analysis. These ‘FAIR’ principles7 impose new procedures and ethical obligations on scientists whose research relies on biological imaging, leading to a paradigm shift in the production of knowledge through image interpretation.

To meet all these requirements, we developed BioImageIT, a unique open-source system integrating imaging data management and analysis, and an operational solution for handling large datasets in line with open science requirements (Fig. 1a). In BioImageIT, data are automatically annotated and processed in a single framework. Although the extent of its flexible design remains to be fully exploited, BioImageIT allows the integration of any existing data-management and image-processing software. For instance, data could be hosted on an OMERO4 database using Bio-Formats8 and processed by deep neural networks using TensorFlow9. Unlike previous frameworks (for example, Galaxy Project10), BioImageIT addresses the following end-user issues.

Fig. 1: BioImageIT overview.

a, Schematic view of BioImageIT architecture. The BioImageIT core comprises data-management and data-processing functionalities. Users can access plugins through a script editor, the BioImageIT graphical interface or the Jupyter platform. Data-management functionalities exploit local files, remote files or the OMERO database. Data processing can perform computations in remote jobs, containers or local runners. Image analysis is provided by plugins, which can be written in different programming languages. Developers can implement their own plugins in BioImageIT and design their own graphical interface. b–i, Example of a lattice light sheet microscopy (LLSM) workflow for 3D reconstruction and tracking of a live genome-edited HeLa cell (CD-M6PR-eGFP) stained with Tubulin Tracker Deep Red for microtubules11. b, Because of the geometry of LLSM scanning, raw 3D images are skewed. c,g, First, realignment (deskew) of raw stacks is performed using the software package pycudadecon. d,h, Richardson–Lucy deconvolution is performed using pycudadecon. e, CD-M6PR-eGFP vesicles are tracked11 using TrackMate. f,i, Deconvolved stacks and tracks are rendered using napari. The full workflow, including pycudadecon, TrackMate and napari, was gathered in BioImageIT.
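
To illustrate why the raw stacks in panel b are skewed, the following is a minimal sketch of the deskew geometry. It is not the pycudadecon implementation. It assumes a stage-scanned acquisition in which consecutive planes are offset laterally by dz·cos(θ), where dz is the stage step and θ is the angle between the light sheet and the scan direction. The function names, the nearest-pixel shift and the example values are illustrative assumptions only.

```python
import math


def deskew_shift_px(dz_stage_um, pixel_size_um, angle_deg):
    """Lateral offset between consecutive planes, in pixels (assumed geometry)."""
    return dz_stage_um * math.cos(math.radians(angle_deg)) / pixel_size_um


def deskew_rows(rows, shift_px, fill=0):
    """Realign one image row taken from each z plane.

    rows: list of equally long lists, one per plane, as stored in the raw stack.
    Plane k is moved by round(k * shift_px) pixels (nearest-pixel shift, no
    interpolation) and the output is padded with `fill`.
    """
    offsets = [round(k * shift_px) for k in range(len(rows))]
    lowest = min(offsets)
    offsets = [o - lowest for o in offsets]  # handles either scan direction
    width = len(rows[0]) + max(offsets)
    return [
        [fill] * off + list(row) + [fill] * (width - off - len(row))
        for row, off in zip(rows, offsets)
    ]


if __name__ == "__main__":
    # Hypothetical acquisition values, chosen only to give a 1-pixel shift.
    shift = deskew_shift_px(dz_stage_um=2.0, pixel_size_um=1.0, angle_deg=60.0)
    for row in deskew_rows([[1, 2], [3, 4], [5, 6]], shift):
        print(row)
```

A production deskew would interpolate sub-pixel shifts and run on the GPU; the sketch only shows how each plane is moved back to its physical position.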

BioImageIT reconciles data management with analysis in a common interactive framework

Most open-source bioimaging software is developed separately and specialized for either data management or data analysis. Therefore, users must write ad hoc scripts or apply manual operations to process the data. In contrast, BioImageIT allows import and tagging of data. Each operation automatically generates metadata that facilitate keeping track of analysis steps and are compliant with the recommended FAIR principles7.
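
The idea of generating metadata with every operation can be sketched as follows. This is not the BioImageIT API; `run_with_provenance`, the `.md.json` sidecar name and the record fields are hypothetical, chosen only to show how a processing step might record what was run, on which input and with which parameters.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def run_with_provenance(tool_name, func, input_path, output_path, **params):
    """Run func(data, **params) on a file and write a JSON provenance sidecar.

    All names here are illustrative. `params` must be JSON-serializable.
    """
    input_path, output_path = Path(input_path), Path(output_path)
    data = input_path.read_bytes()
    result = func(data, **params)
    output_path.write_bytes(result)

    record = {
        "tool": tool_name,
        "parameters": params,
        "input": {
            "path": str(input_path),
            "sha256": hashlib.sha256(data).hexdigest(),
        },
        "output": {
            "path": str(output_path),
            "sha256": hashlib.sha256(result).hexdigest(),
        },
        "date": datetime.now(timezone.utc).isoformat(),
    }
    # The sidecar sits next to the output, e.g. out.tif -> out.tif.md.json
    sidecar = output_path.with_name(output_path.name + ".md.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return record
```

Storing checksums of both input and output lets a later reader verify that a result file was produced from a given raw file, which is the kind of traceability the FAIR principles ask for.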

BioImageIT is interoperable and reusable

Processing tools are pre-packaged and stored in public repositories. Thus, users can reuse them and create data analysis workflows with software developed in any language.

BioImageIT is developer friendly

BioImageIT makes it easy for data scientists to distribute new tools embedded in a package. Only a basic configuration file (wrapper) is required for identification in BioImageIT.
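
As an illustration of what such a wrapper might contain, the sketch below describes a tool with a small descriptor and turns it into a command line. The descriptor layout, the tool name `deconv.py` and the helper `build_command` are assumptions made for this example; they do not reproduce the actual BioImageIT wrapper format.

```python
import shlex

# Hypothetical descriptor: identifier, command template, inputs, outputs
# and parameters with default values.
WRAPPER = {
    "id": "example_deconvolution",
    "command": "python deconv.py --input {input} --output {output} "
               "--iterations {iterations}",
    "inputs": {"input": "image"},
    "outputs": {"output": "image"},
    "parameters": {"iterations": 10},
}


def build_command(wrapper, **values):
    """Fill the command template and return an argument list for subprocess."""
    allowed = (
        set(wrapper["inputs"]) | set(wrapper["outputs"]) | set(wrapper["parameters"])
    )
    unknown = set(values) - allowed
    if unknown:
        raise KeyError(f"unknown arguments: {sorted(unknown)}")
    merged = {**wrapper["parameters"], **values}  # user values override defaults
    missing = allowed - set(merged)
    if missing:
        raise KeyError(f"missing arguments: {sorted(missing)}")
    # Quote each value so that paths containing spaces stay a single argument.
    quoted = {key: shlex.quote(str(value)) for key, value in merged.items()}
    return shlex.split(wrapper["command"].format(**quoted))


if __name__ == "__main__":
    print(build_command(WRAPPER, input="my image.tif", output="out.tif"))
```

Because the descriptor only names a command and its arguments, the wrapped tool can be written in any language, which is what makes this kind of packaging language-agnostic.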

BioImageIT is user focused

BioImageIT consists of three layers: back-end plugins, a Python application programming interface (API) and a graphical user interface (GUI). Users can choose the most appropriate level of interaction. Biologists may prefer to be assisted step by step, in which case the GUI is appropriate. Data analysts familiar with writing scripts may use the Python API. Data scientists can adopt the packaging back-end to provide a stand-alone demonstration of their new processing tool.

In summary, BioImageIT is a generic framework for managing and analyzing imaging data and for ensuring their traceability, with the exception, at this stage, of patient-sensitive data. Unlike previous platforms, it addresses the needs of end users and provides a flexible solution to link annotated data with processing tools. It facilitates interactions between experimental and data scientists. Because BioImageIT is built upon existing technologies, it may be considered as a computational overlay providing a user-friendly interface for existing software to end users, without hindering the addition of new analytical methods by experienced developers. Finally, the ability of BioImageIT to integrate with data interaction tools allows highly specialized examples to be deployed for a specific application domain, as illustrated here for lattice light sheet microscopy data processing (Fig. 1b–i; see other case studies in Supplementary Figs. 1 and 2). These examples demonstrate how BioImageIT can be used to build sophisticated image-analysis workflows for datasets obtained using advanced microscopy techniques. BioImageIT is being deployed on ten imaging platforms covering a large part of the France-BioImaging national infrastructure (https://bioimageit.github.io/#/about). With some of them, we are currently developing a BioImageIT Python API to run analysis pipelines from their own database GUI. To facilitate or generalize this use, including with OMERO4, a Java or a REST API will be developed. In this ongoing phase, using imaging core facilities as pillars, our primary goal is to build a BioImageIT community.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.