Abstract
Space biology research aims to understand fundamental spaceflight effects on organisms, develop foundational knowledge to support deep space exploration and, ultimately, bioengineer spacecraft and habitats to stabilize the ecosystem of plants, crops, microbes, animals and humans for sustained multi-planetary life. To advance these aims, the field leverages experiments, platforms, data and model organisms from both spaceborne and ground-analogue studies. As research is extended beyond low Earth orbit, experiments and platforms must be maximally automated, light, agile and intelligent to accelerate knowledge discovery. Here we present a summary of decadal recommendations from a workshop organized by the National Aeronautics and Space Administration on artificial intelligence, machine learning and modelling applications that offer solutions to these space biology challenges. The integration of artificial intelligence into the field of space biology will deepen the biological understanding of spaceflight effects, facilitate predictive modelling and analytics, support maximally automated and reproducible experiments, and efficiently manage spaceborne data and metadata, ultimately to enable life to thrive in deep space.
Main
Space biology research focuses on answering fundamental mechanistic questions about how molecular, cellular, tissue and whole organismal life responds to the space environment. Biological stressors of spaceflight include ionizing radiation, altered gravitational fields, accelerated day–night cycles, confined isolation, hostile closed environments, distance–duration from Earth1, planetary dust regolith2,3, and extreme temperatures and atmospheres4,5. Moreover, spaceflight stressors are probably compounded and amplified with increasing time in space and distance from Earth1,6. Understanding, predicting and mitigating these changes at all levels of biology is increasingly important, given the deep space exploration goals of the National Aeronautics and Space Administration (NASA) towards cis-lunar and Mars missions. Ultimately, the goal of space biology research is to extend beyond an understanding of how extraterrestrial conditions affect life, to enable bioengineered solutions for sustained life on the Moon, Mars and during deep space missions beyond low Earth orbit (LEO)7.
In this Review, we present findings from the ‘Workshop on Artificial Intelligence and Modeling for Space Biology’ organized by NASA in June 2021, which sought to map the roles of artificial intelligence (AI), machine learning (ML) and biological computational modelling in the field of space biology research over the next decade8. On the basis of mathematical principles and computer science, AI and ML methodology trains algorithms to predict outcomes and probabilities of interest9,10,11. A companion review article presents the workshop participants’ recommendations regarding the roles of AI and ML for astronaut, ecosystem and precision space health12.
Workshop participants highlighted three main near-term focus areas, which will be discussed in the following sections. First, fundamentally transformative approaches leveraging AI and ML will be needed to automate biology experiments in settings beyond LEO. These approaches must facilitate the generation and analysis of reproducible datasets that incorporate multiple types of measurement to achieve a comprehensive characterization of organismal responses to a variety of extraterrestrial conditions. Such datasets can then be used for robust predictive modelling of spaceflight responses at every biological level. Second, discussion centred on the need for data management standards to ensure AI readiness and open-source data availability and organization. Workshop participants emphasized the importance of supporting an open-science culture and approach in space biology, which aims to promote transparency, inclusivity, data sharing and data access for reproducibility13, as well as ensuring FAIR data management practices14,15 (that is, findable, accessible, interoperable and reusable). Finally, workshop participants agreed on a set of existing AI and ML methods with tremendous promise for space biology applications, as well as near-term development approaches for novel AI and ML methods designed specifically for space biology challenges.
On the basis of the workshop discussions, this report proposes a widespread implementation of AI and ML methods at every level of space biology research (from ground to spaceborne research). This effort has the potential to revolutionize the breadth and depth of our knowledge in two central ways: (1) by enabling efficient, automated and maximally autonomous experimentation and data collection in space research environments through self-driving labs; and (2) by assisting the management, analysis, modelling and interpretation of current and future space biology datasets.
Space biology research data
Space biology research leverages spaceflown and ground-analogue experiments using model organisms to understand space impacts on increasingly complex life. Experimental models include unicellular organisms (for example, prokaryotic, eukaryotic, yeast, fungi), tissue-on-a-chip models16, invertebrates (for example, Drosophila melanogaster, Caenorhabditis elegans, tardigrades), simple model plants (for example, Arabidopsis thaliana), vertebrates (for example, mice, rats, fish), and crops and edible plants1,16,17. Model organism research is key to translational science, with the resulting evidence influencing the direction of human health research and driving the design of life support systems18.
At the molecular and cellular levels, space biology experiments seek to characterize all possible spaceflight-induced changes in cell morphology19, development and differentiation20, protein regulation21, epigenetic processes22, and gene expression23, among others24. Organ-level modelling systems such as tissue-on-a-chip models are used to study shifts in cellular organization and communication25,26. The current understanding of biological responses to spaceflight incorporates experimental evidence from a variety of data types along hierarchical biological levels, from molecular to single cell to whole organism (Fig. 1). At the cellular level in both humans and rodents, six fundamental responses to spaceflight have been well characterized, including increased oxidative stress, DNA damage, mitochondrial dysregulation, telomere length alterations, and epigenetic, metabolic and microbiome changes1. These responses have been linked to a variety of physiological effects, including cardiovascular dysregulation, central nervous system impairments, bone loss and immune dysfunction1.
The majority of current space biology knowledge originated through ground-analogue experiments27,28, satellites in LEO17, and from experiments on the Space Shuttle29 or the International Space Station (ISS)30,31. To facilitate NASA’s goals of human exploration beyond LEO to the Moon and Mars, space biological research is now focusing on characterizing the risks of deep space travel for mammalian, plant and microbial life. For example, the successful Artemis 1 launch in 2022 saw the deployment of NASA’s BioSentinel experiment, which sent yeast cells to heliocentric orbit in an automated microfluidic culturing device aboard a CubeSat to measure the effects of deep space radiation32,33. Experimental platforms such as BioSentinel that are sent beyond LEO must be robust to several limitations, including long transit times, extreme environmental conditions, limited crew availability and limited sample return. Small experimental platforms such as CubeSats must also have the capability to generate their own power and control their internal environment for temperature, carbon dioxide levels and so on. Conducting biological research beyond LEO will require advanced technologies, not yet fully developed, that are resilient to space conditions and can operate with limited communication with Earth. Such technology needs to enable partly or fully automated experiments, together with continuous environmental monitoring, and in situ data processing and analysis. Although aspects of these capabilities exist on Earth, there are major technology gaps that must be resolved before routine experimentation with relevant biological models can take place beyond LEO. Due to the necessarily automated and in situ nature of future biological experimentation beyond LEO, it follows that AI and ML will have essential roles in enabling these platforms.
Automated experiments in space supported by AI
Space biology research and data analysis have benefited from innovations in increasingly efficient and sensitive research technologies34,35. In the broader field of biological research, next-generation sequencing platforms36, big data frameworks and computational libraries for data storage, processing and analysis have led to the ability to conduct groundbreaking clinical studies with multi-omics data collected across thousands of samples37,38. Recent innovations in technologies such as single-cell sequencing, exosome sequencing, cell-free nucleic acid sequencing, spatial transcriptomics39,40 and nanopore sequencing41 have greatly broadened the potential for longitudinal characterization of cellular and genomic dynamics39,40,41,42,43,44,45. Findings derived from spaceflown experiments leveraging these technologies include ocular/retinal alterations23,46, liver dysfunction47,48, microRNA signatures49, mitochondrial stress50, gut microbiome alterations51 and alternative splicing in space-grown plants52.
However, it is difficult to leverage these technologies to their full potential in space, where workforce and resources are extremely limited. Most experimentation in space is expensive, time-consuming and not automated. This results in small experiments with few samples and replicates, and high levels of variability due to differing sample-handling procedures53,54,55. This makes AI and ML analysis of current space biological data difficult, as the models become under-determined and overfitted to the training data due to high dimensionality (tens of thousands of variables compared with tens or hundreds of data points), and technical batch effects make it challenging to combine datasets to gain higher sample numbers.
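The under-determination problem can be illustrated with a minimal simulation, offered as a sketch rather than an analysis of any real dataset. All values are assumptions chosen for illustration: 8 training samples, 5,000 binary variables and binary labels, every one drawn as an independent coin flip, so there is no true signal to learn. Selecting the single variable that best matches the training labels still produces an apparently excellent model, which can then be checked against held-out samples.

```python
import random

def simulate(n_train=8, n_test=200, n_features=5000, seed=0):
    """Pick the 'best' of many pure-noise features and score it on new data."""
    rng = random.Random(seed)

    def draw(n_samples):
        # Features and labels are independent coin flips: no real signal exists.
        X = [[rng.getrandbits(1) for _ in range(n_features)]
             for _ in range(n_samples)]
        y = [rng.getrandbits(1) for _ in range(n_samples)]
        return X, y

    X_train, y_train = draw(n_train)
    X_test, y_test = draw(n_test)

    def accuracy(X, y, j):
        # Fraction of samples whose feature j equals the label.
        return sum(row[j] == label for row, label in zip(X, y)) / len(y)

    # "Training": keep the single feature that best matches the training labels.
    best = max(range(n_features), key=lambda j: accuracy(X_train, y_train, j))
    return accuracy(X_train, y_train, best), accuracy(X_test, y_test, best)

train_acc, test_acc = simulate()
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

With far more variables than samples, some variable is expected to match the labels by chance alone, so training performance says little about generalization. This is one reason small spaceflight datasets call for held-out validation or for pooling across studies once batch effects are handled.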
Workshop participants agreed that a comprehensive effort to streamline and automate biological experimentation in space is needed to generate large-scale, high-quality, AI-ready, reproducible datasets required to meaningfully expand and validate our scientific understanding and knowledge base. Here we define ‘AI-ready’ to mean a dataset that can be used to train an AI and ML model without further preprocessing except that which may be uniquely required for model architecture.
Current terrestrial automated science
On Earth, basic molecular biology tasks such as pipetting, sequencing library preparation, cell culture maintenance, microscopy, quantitative phenotyping and behavioural change detection have already been automated in a variety of platforms56,57,58,59,60,61. Biofoundries apply high-throughput laboratory automation to generate thousands of strain constructs and DNA assemblies per week62. These advances now enable robust technical reproducibility across experiments, allowing researchers to isolate only the effects of relevant biological independent variables. However, these platforms still require a great deal of personnel operation and hands-on time. Ideally, a fully automated experimental system for spaceborne research will integrate multiple robotic functions (for example, pipetting + cell culture + microscopy photo capture and analysis + phenotyping + cell lysis and nucleic acid isolation + library preparation + sequencing + data analysis). The only human input required should be the initial set-up of experimental parameters and the command to begin experimentation, and system-requested input when unexpected experimental outcomes are observed. The new domain of cloud laboratories for automated science, such as Emerald Cloud Lab, provides facilities to researchers who design and run experiments through an application programming interface61,63,64.
Current and potential spaceflight automated science
At present, there is limited automation for biological data collection and analysis in spaceflight although progress has been made. This is particularly seen in automating spaceborne biological image acquisition. For example, a real-time multi-fluorescence cell culture microscope was established on the ISS65, and Arabidopsis response to microgravity was live-imaged by confocal microscopy66. A recent deep learning approach for automated cell segmentation based on crowdsourced annotation libraries could be leveraged to greatly expedite in situ deep space knowledge discovery67. One possibility for AI and ML and automation in space would be to move from the current, manual analysis of ISS rodent behavioural video68, to an ML-based analysis of ambulatory, sensorimotor and behavioural spaceflight effects69,70,71,72. Another possibility would be to leverage natural language processing with vision-transformer models to develop platforms for automatic, real-time image descriptions and labelling73,74.
Another area of expanding automated AI and ML data capture analysis in spaceflight is ocular/retinal imaging. IDx-DR, an AI-enabled analysis platform for detection of diabetic retinopathy in retinal images, is a Food and Drug Administration-approved AI-based method75. This indicates potential feasibility of AI-based methods to detect space-related pathologies such as spaceflight-associated neuro-ocular syndrome (SANS; a high-priority ocular/visual risk for long-duration microgravity missions76,77). At present, informative changes in vascular branching of the retina and other tissues can be mapped and quantified by NASA’s AI-enhanced Vessel Generation Analysis (VESGEN) software77. Real-time detection of experimental results and pathologies in spaceflight could be enabled by full integration of VESGEN with computer-supported ophthalmic optical coherence tomography (OCT) and OCT-angiography (OCT-A), which have recently been updated on the ISS for monitoring SANS78 (a technology increasingly miniaturized79 and AI-integrated). Fundoscopy, OCT and OCT-A are now available for real-time, longitudinal imaging of small animals80, which would greatly expand experimental capabilities81.
Recent years have seen successful sequencing of nucleic acids aboard the ISS, facilitated by a long-read sequencer (Oxford Nanopore Technologies)41,82,83,84. Predictably, testing and adjustment were required for the sample loading and sequencing procedure due to the effects of microgravity on liquid dynamics83, illustrating the investment required to automate complete experimental procedures in space, but providing a powerful example for transitioning state-of-the-art research capabilities to space.
Self-driving labs
Automated science in space should be aimed at enabling partially or fully autonomous deep-spaceflight-ready ‘self-driving labs’85,86 that employ AI and ML in a closed-loop system to produce new knowledge and optimize experimental design based on data collected in previous experiments (Fig. 2). In a closed-loop self-driving lab, the AI and ML system has the capability to choose the hypothesis to be tested and the parameters for the next experiment87. In the past decade, advances in several research areas have made such self-driving labs possible on Earth88,89,90,91,92,93. We now have the ability to automate many biological processes using state-of-the-art microfluidics chips for optics, imaging and robotics94,95,96,97,98.
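The closed-loop idea can be sketched in a few lines. This is a toy under stated assumptions, not a description of any existing self-driving lab: `run_experiment` is a hypothetical stand-in for the automated wet lab (a simulated growth readout that peaks at a nutrient level of 0.6), and a simple hill-climbing planner stands in for the AI and ML model that would choose the next experiment.

```python
def run_experiment(nutrient_level):
    """Hypothetical stand-in for the automated lab: simulated growth readout.

    The response curve (peak at nutrient_level = 0.6) is an assumption made
    purely for illustration.
    """
    return 1.0 - (nutrient_level - 0.6) ** 2

def propose_next(history, step):
    """Planner: test both neighbours of the best condition seen so far."""
    best_x, _ = max(history, key=lambda record: record[1])
    return [min(1.0, best_x + step), max(0.0, best_x - step)]

def closed_loop(start=0.1, step=0.25, rounds=12):
    """Alternate between planning, executing and updating, with no human input."""
    history = [(start, run_experiment(start))]
    for _ in range(rounds):
        best_before = max(readout for _, readout in history)
        for level in propose_next(history, step):
            history.append((level, run_experiment(level)))  # execute and record
        if max(readout for _, readout in history) <= best_before:
            step /= 2  # no improvement this round: search more finely
    return max(history, key=lambda record: record[1])

best_level, best_growth = closed_loop()
print(f"best nutrient level found: {best_level:.3f} (readout {best_growth:.3f})")
```

In a real system the planner would be replaced by a learned model (for example, one that balances exploring untested conditions against refining promising ones), and the readout would come from instruments rather than a formula. The loop structure of propose, execute, record and update would stay the same.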
In spaceflown research programmes, implementation of self-driving labs will aid comprehensive characterization of the effects of spaceflight on living systems, ultimately feeding research findings into applications such as in situ analytics, Earth-based open-science research programmes and precision astronaut health systems.
A central goal of developing and employing autonomous, AI-supported bioexperimentation systems such as self-driving laboratories should be to generate data that can inform autonomous precision space health systems that provide decision support for crew health management during LEO, cis-lunar and Mars missions12. As automated experimentation becomes more widely available, the space biology field should shift to conducting longitudinal studies99, characterizing physiological changes over the duration of an entire mission. These longitudinal data will help identify biomarkers from various physiological, molecular and microbial systems that can be integrated to create individualized baseline models for humans and other organisms. Monitoring in-flight changes to these biomarker signals will help predict and prevent adverse organismal health outcomes, and predict how different organisms will react to spaceflight conditions.
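As a minimal sketch of an individualized baseline model, the example below fits a per-subject mean and standard deviation to pre-flight measurements and flags in-flight values that deviate strongly from that baseline. The biomarker (resting heart rate), the numbers and the z-score threshold of 3 are all made up for illustration; a flight system would presumably use richer, multivariate models.

```python
from statistics import mean, stdev

def fit_baseline(preflight_values):
    """Summarize one subject's pre-flight measurements as (mean, std dev)."""
    return mean(preflight_values), stdev(preflight_values)

def flag_deviations(baseline, inflight_values, z_threshold=3.0):
    """Return (day, value, z-score) for readings far from the personal baseline."""
    mu, sigma = baseline
    flagged = []
    for day, value in inflight_values:
        z = (value - mu) / sigma
        if abs(z) > z_threshold:
            flagged.append((day, value, z))
    return flagged

# Illustrative resting heart rates (beats per minute); not real astronaut data.
preflight = [62, 64, 61, 63, 65, 62, 63]
inflight = [(1, 64), (10, 66), (30, 71), (60, 63)]

baseline = fit_baseline(preflight)
alerts = flag_deviations(baseline, inflight)
print(alerts)
```

Because the baseline is fitted per individual, the same reading can be unremarkable for one subject and an alert for another, which is the point of individualized monitoring.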
Data standards and management
A large portion of the workshop discussion centred on the importance of establishing data standards and increasing support for data management to generate maximally AI-ready datasets in space biology research.
Data management for AI readiness
Raw biological data can be complex, sparse and heterogeneous, and therefore not typically ready for AI and ML applications. Biological measurements relevant to a single scientific question may be discrete or continuous, qualitative or quantitative, single- or multi-dimensional, incomplete, highly descriptive (for example, the appearance of cells in culture), and unstructured (particularly for phenotypic and behavioural data). Different experimental practices between facilities and researchers manifest as biases in the data, complicating integration of data from various experiments into a unified platform. These issues are amplified in the space biology field for several reasons. Each spaceflight experiment is conducted by an astronaut, while the ground control studies are conducted by Earth-based researchers, introducing a notable source of variability. Further, biological datasets from different missions have environmental variables (for example, duration, temperature, radiation, carbon dioxide) associated with them that differ across missions and need to be integrated with biological-results data. For space biological data to become AI-ready, we need harmonization standards that incorporate space-specific data and all metadata.
For space biology and health, the NASA GeneLab repository provides open-source, uniformly processed multi-omics data from spaceflight and ground-analogue studies, making space biology multi-omics data as AI-ready as possible100,101. The success of GeneLab in managing its data and metadata led to more efficient collection, curation and management of other spaceflight-relevant data (phenotypic, physiological, biospecimens, environmental telemetry, imaging, microscopy, behavioural; tabular, imaging, video)102 now all part of a unified NASA Open Science Data Repository. Additional work is needed to establish widely adopted standards for AI readiness in these research domains103,104. To best leverage all available data, space biology needs to invest in tools to perform automated conversion from existing, non-AI-ready formats into AI-ready formats. To facilitate this, the community needs a set of standardized ontologies and data formatting guidelines specifically for space biology (for example, the inclusion of a datasheet to describe each dataset105). These standards can then inform experimental design to ensure that data from future missions are generated in an AI-ready format.
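The kind of automated conversion described above might look like the following sketch, which maps heterogeneous environmental metadata onto one schema with consistent units. The field names (`temp_f`, `co2_percent` and so on) and the target schema are hypothetical, invented for this example; they are not GeneLab or Open Science Data Repository conventions.

```python
# Hypothetical mapping: source field -> (harmonized field, unit conversion).
CONVERTERS = {
    "temp_c": ("temperature_c", lambda v: float(v)),
    "temp_f": ("temperature_c", lambda v: (float(v) - 32.0) * 5.0 / 9.0),
    "co2_ppm": ("co2_ppm", lambda v: float(v)),
    "co2_percent": ("co2_ppm", lambda v: float(v) * 10_000.0),  # 1% = 10,000 ppm
    "duration_days": ("duration_days", lambda v: float(v)),
    "duration_hours": ("duration_days", lambda v: float(v) / 24.0),
}

def harmonize(record):
    """Convert one metadata record to the common schema.

    Returns the harmonized record plus the list of fields that could not be
    mapped, so that nothing is dropped silently.
    """
    clean, unmapped = {}, []
    for key, value in record.items():
        if key in CONVERTERS:
            name, convert = CONVERTERS[key]
            clean[name] = convert(value)
        else:
            unmapped.append(key)
    return clean, unmapped

mission_a = {"temp_c": 22.5, "co2_ppm": 3100, "duration_days": 33}
mission_b = {"temp_f": "71.6", "co2_percent": 0.3, "duration_hours": 720,
             "notes": "fan replaced on day 12"}

clean_a, unmapped_a = harmonize(mission_a)
clean_b, unmapped_b = harmonize(mission_b)
print(clean_b, unmapped_b)
```

Reporting unmapped fields, rather than discarding them, keeps free-text metadata visible for later curation or ontology mapping.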
A key part of data standards is the establishment of uniformly used vocabularies that are grounded in common conceptualizations (that is, ontologies), which increase data discovery and reuse. Biomedical ontologies have existed for over 50 years and many are in widespread use106,107,108, but no single ontology includes foundational concepts in the space biology/aerospace medicine domain (for example, specially developed experimentation hardware types, space environment types, parameters and so on). The space biology community should focus efforts on developing one or more such ontologies to standardize metadata with respect to space-relevant concepts and data structures, and across microbial, plant, animal and human systems. An early effort is the Radiation Biology Ontology produced by NASA GeneLab and STOREDB109.
Automated, AI-assisted data harmonization and dataset curation will be a critical part of advanced space biology research architectures like the one shown in Fig. 3. Such architectures must be designed to support the entire experimental process from investigation management, to experiment execution, to data publication, through to open-science13 data repository submission (with appropriate security and governance measures to guarantee protection of private data resources). Investigator data can be effectively integrated into the NASA Open Science Data Repository through embedded digital experiment notebooks to preserve experimental parameters and analyses, with link-out capability to approved, external data resources for seamless integration with research data110,111. Use of space biology metadata ontologies can support automated harmonization across the wide spectrum of organisms studied, equipment used and experimental designs. Because space biology datasets cover a range of modalities (each requiring distinct data processing), such advanced architectures must include a suite of metadata, data acquisition and data processing tools. The proposed environment is similar to the successful virtual observatory paradigm in the planetary sciences112,113. Moreover, effective methodology transfer from planetary science to biomedical research has already occurred between NASA’s Jet Propulsion Laboratory and the National Cancer Institute’s Early Detection Research Network114.
Fig. 3: The diagram shows the data and information flow in which a cloud-based data management environment serves as the nexus between space-based data and research and Earth-based researchers and analysts, enabling open-science access to data and analytics and facilitating preparation of AI-ready datasets.
Full utilization of such an environment would ensure that all newly generated space biology datasets are AI-ready, and facilitate conversion of previously generated datasets into AI-ready formats. In addition, embedded open-science capabilities will enable broad data sharing and reuse, and avoid metadata decay and long-term data maintenance issues. A similar data management tool was implemented recently by the National Institute of Standards and Technology, to address data harmonization and standards for their principal investigator and research community115. A unique aspect of the space biology data management system is that ultimately, such an architecture must be cloud-based and linked to in-flight data acquisition systems, and eventually deep space communication for critical data downlinks116,117.
Finally, it will be important to establish and adopt robust dataset readiness metrics to aid AI-modelling researchers in understanding the applicability of various datasets. Technology readiness levels have already been proposed for ML methods118. Such metrics, if applied to datasets, could be useful for understanding the AI readiness of space biological data. Moreover, a bronze, silver and gold reproducibility standard has been proposed for life science AI and ML workflows119. A similar standard could be implemented for AI and ML analysis of space biological data to ensure reproducibility and confidence in results. These standards would be tailored for different AI and ML methods.
Organizing data
Standardized space biology ontologies will enhance the opportunity to construct knowledge graphs (KGs)120,121 compatible with the unique experimental outcomes of space biology research. These KGs will incorporate and model causal relationships using ontological content and space data, enabling the inference of physiological responses to experimental perturbations from multi-omic, phenotypic, imaging and environmental telemetry data. An existing relevant KG is SPOKE (Scalable Precision Medicine Oriented Knowledge Engine)122, which was developed at the University of California, San Francisco with National Science Foundation funding and is linked to about 30 biomedical, chemical, molecular and pharmaceutical databases123,124. Analysis of transcriptomic spaceflown mouse data using SPOKE identified spaceflight-induced physiological changes similar to terrestrial clinical conditions, consistent across multiple tissue types, demonstrating the utility of KG-based systems for furthering our understanding of space biology122. A notable resource for data mining to model causal relationships in a space health context is the set of directed acyclic graphs produced and maintained by the Human Systems Risk Board125,126.
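To make the KG concept concrete, the sketch below stores a handful of subject, relation and object triples and searches for a chain of relations connecting two concepts. The triples are toy examples loosely based on the spaceflight responses named earlier in this Review; they are illustrative assumptions, not curated content from SPOKE or any other resource.

```python
from collections import deque

# Toy (subject, relation, object) triples, for illustration only.
TRIPLES = [
    ("spaceflight", "induces", "oxidative stress"),
    ("spaceflight", "induces", "mitochondrial dysregulation"),
    ("oxidative stress", "linked_to", "cardiovascular dysregulation"),
    ("mitochondrial dysregulation", "linked_to", "bone loss"),
]

def build_graph(triples):
    """Index triples as adjacency lists: node -> [(relation, neighbour), ...]."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
    return graph

def find_path(graph, start, goal):
    """Breadth-first search for the shortest chain of relations, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None

graph = build_graph(TRIPLES)
path = find_path(graph, "spaceflight", "bone loss")
for subject, relation, obj in path:
    print(f"{subject} --{relation}--> {obj}")
```

Production KGs add typed nodes, edge weights and provenance, but the core query pattern of following relations from a perturbation to candidate physiological outcomes is the same.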
AI and ML methods for space research
Space biology combines the complexity of the biological and medical fields with an entirely new dimension: extended spaceflight in environments not truly known, or very different from Earth. Therefore, a portion of the workshop discussion focused on development of AI and ML algorithms specifically designed for data collected in novel space constraints and environments. A parallel problem is limited computational resources in spaceflight, and there is a detailed discussion of recommendations for this problem in the companion review article from this workshop12.
Interpreting biological data
Explainable AI (xAI) provides a human-readable explanation of the evidence and rationale for predictions and recommendations, particularly important in biomedical research127,128,129,130,131. As a central goal of space biological research is to establish predictive characterization of spaceflight effects on human astronaut health through translation from model organisms, all aspects of AI and ML development in this field should embrace and incorporate a degree of xAI practices as well as post hoc explainability and model interpretability with tools such as LIME132 and SHAP133.
Generating new data
Workshop participants recommended the creation of a collection of generative models (model zoo) that have been pre-trained for each of the main types of space biology data. These models, which typically use generative adversarial networks (GANs)134 or variational autoencoder architectures (VAEs)135, can be used to produce synthetic data to validate new and existing space biology AI and ML methods. For example, the ECG Generator of Representative Encoding of Style and Symptoms model generates synthetic electrocardiogram (ECG) signals after training on data from an astronaut wearable device, providing a large dataset of realistic synthetic data on which to train models for astronaut health monitoring136; and GANs have been used to generate synthetic DNA-sequencing137 and RNA-sequencing138 data.
Generative models can also provide powerful solutions for data mapping: generating data based on source data that are often dimensionally smaller than the target. For example, VAEs can translate ECG readings into an activation map that re-creates the electrical activity of the heart139.
Next-generation models
Workshop participants recommended looking beyond established AI and ML techniques and classic deep learning architectures to investigate potential next-generation AI and ML models and related computing hardware. Three opportunities in particular were considered promising because their capabilities could help solve specific challenges of space biology: one-shot learning, advanced transfer learning and alternative hardware architectures. One-shot and transfer learning have potential to help address small sample sizes, or space data collected in different settings than training datasets; and neuromorphic computing can support in situ data collection and analysis.
First, one-shot (or few-shot) learning is a technique for developing an AI and ML model with limited training examples, which is a valuable characteristic for space biosciences due to the sparse availability of biological data gathered in a spaceflight context, especially from astronauts. This technique has been primarily applied to image similarity and classification, which implies that analysis of spaceborne histological data may be particularly well suited to established implementation architectures140,141.
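One common few-shot strategy, shown here as a minimal sketch, is nearest-prototype classification: each class is represented by the mean of its few labelled examples, and a new sample receives the label of the closest prototype. This is in the spirit of prototypical networks but omits the learned embedding; the two-dimensional vectors and class names below are made up and stand in for embeddings that a pretrained image model would produce.

```python
import math

def prototype(vectors):
    """Mean vector of a class's few labelled examples."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def classify(query, prototypes):
    """Assign the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Two labelled examples per class (hypothetical image embeddings).
support = {
    "healthy": [[0.9, 0.1], [0.8, 0.2]],
    "damaged": [[0.2, 0.9], [0.1, 0.7]],
}
prototypes = {label: prototype(vectors) for label, vectors in support.items()}

print(classify([0.7, 0.3], prototypes))
print(classify([0.3, 0.6], prototypes))
```

Adding a new class requires only a few labelled examples and no retraining, which suits settings where spaceflight samples are scarce.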
Second, transfer learning aims to mimic the manner in which biological intelligence can leverage expertise in one area to more quickly and effectively learn to tackle problems in an adjacent domain. ML models can be trained using one dataset, and then rapidly adapted to function in an adjacent problem space using a second dataset. Most commonly, this is done when there is limited or no data available for the target problem space, such as predicting human health risks in a deep space context. The result of such transfer learning, using a large amount of data from a related field with subsequent adaptation using a limited amount of data from the actual problem space, can be more effective than attempting to use only the smaller actual dataset, even when used with data boosting techniques142,143.
Third, there are emerging chipsets and hardware architectures that offer promising capabilities for AI and ML applications with minimal SWaP-C characteristics (size, weight, power and cost)144. One particularly important example is neuromorphic computing, which represents a dramatic departure from the classical von Neumann architecture. The term neuromorphic emerged in the late 1980s, and at the time primarily referred to analogue or hybrid analogue–digital implementations of brain-inspired computing, with research that included chips containing miniaturized analogue circuits to represent neural network architectures. However, hardwired analogue chips are impractical for contemporary large-scale neural nets and have limited reconfigurability for the requirements of adaptive systems. For this reason, neuromorphic innovations shifted towards digital solutions, with chips containing many thousands of specially designed cores, each with an arithmetic logic circuit, memory and queue register that can represent one or more neurons. This approach retains the benefits of neuromorphic computing, most notably low power consumption and the biologically inspired nature of spiking neural networks, rather than the clock-synchronized von Neumann architecture145,146. The appeal of neuromorphic computing for AI and ML problems is further amplified in a spaceflight context due to the resilience of neuromorphic hardware to ionizing radiation144,147,148. All these benefits are particularly important when in situ spaceborne AI and ML systems must continually learn, as opposed to only performing inferencing with a static model. This need for ‘continual learning’ is anticipated to be a likely scenario for long-duration human space exploration during which human biological systems adapt to spaceflight, thereby establishing a ‘new normal’ from which indications of disease must be detected.
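The spiking neural networks mentioned above are built from units such as the leaky integrate-and-fire neuron, sketched below in software. This is a textbook-style toy, not the design of any particular neuromorphic chip; the threshold, leak factor and input currents are arbitrary illustrative values.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    At each step the membrane potential decays by the leak factor, adds the
    incoming current, and emits a spike (1) if it reaches the threshold,
    after which it is reset.
    """
    potential = 0.0
    spikes = []
    for current in input_current:
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = reset
        else:
            spikes.append(0)
    return spikes

strong_input = simulate_lif([0.5] * 6)    # drive strong enough to reach threshold
weak_input = simulate_lif([0.05] * 20)    # leak outpaces the weak drive
print(strong_input, sum(weak_input))
```

The neuron produces output only when its input crosses the threshold, so computation and communication are event-driven rather than clocked, which is the main source of the low power consumption attributed to neuromorphic hardware.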
In addition, workshop discussions acknowledged that other advances in AI and ML and computing hardware may soon emerge that are highly applicable to operations in the deep space environment and biosciences domain, such as (1) compute-in-memory solutions based on resistive random-access memory149, (2) advances in the reliability of space-grade graphic processing units and field programmable gate array devices150, (3) self-repairing neural networks, and (4) quantum computing151. In the case of quantum computing, the properties of quantum states, such as superposition and entanglement, are used to represent the entire search space of an optimization problem, which can then be observed to trigger a collapse of the quantum state down to a single eigenstate that defines the optimal result. For certain classes of problems, this can offer computing performance that is many orders of magnitude faster than algorithms that run on binary computing hardware. This is relevant to AI and ML research because models typically ‘learn’ through iterative adjustments in their parameters to minimize a cost function, which makes quantum optimization and AI and ML a promising combination152,153. There are still technology gaps before quantum computing can be fully implemented on spacecraft, but the first successful in-flight training and inference of quantum machine learning for Earth observation was recently completed154.
In the same way that traditional neural networks are inspired by human neural programming, self-repairing neural networks are inspired by organisms such as immortal jellyfish, which continuously regenerate their neural networks in anoxic or very-low-oxygen environments in the deep ocean155. These self-repairing models may be useful for robust training and inference on deep space expeditions, as they are capable of graceful degradation and self-repair even when interrupted by a radiation-induced single-event upset. Some of these model designs focus on emulating the biological self-repair role of astroglial cells, whereas others focus on adding a ‘safe layer’ to the model architecture that imposes self-learned constraints on outputs and can trigger any necessary self-repair156,157.
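The idea of an output check that triggers repair can be sketched in a few lines. The example below is a deliberately simplified stand-in and not the method of the cited self-repairing or self-correcting networks: the output bounds are fixed by hand rather than self-learned, and repair simply restores the weights from a redundant copy that is assumed to sit in protected memory. The class name, the bounds and the bit-flip helper are hypothetical.

```python
import math
import struct


def flip_bit(value, bit):
    """Flip one bit of a 64-bit float to mimic a single-event upset."""
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


class GuardedModel:
    """Tiny linear model with a hand-written output check (hypothetical)."""

    def __init__(self, weights, low, high):
        self.weights = list(weights)
        # Redundant copy, assumed here to be held in protected memory.
        self._reference = tuple(weights)
        self.low, self.high = low, high
        self.repairs = 0

    def _raw_output(self, inputs):
        return sum(w * x for w, x in zip(self.weights, inputs))

    def predict(self, inputs):
        output = self._raw_output(inputs)
        # Constraint check: the output must be finite and inside the bounds.
        if not (math.isfinite(output) and self.low <= output <= self.high):
            # Violation: restore the weights and recompute the prediction.
            self.weights = list(self._reference)
            self.repairs += 1
            output = self._raw_output(inputs)
        return output


if __name__ == "__main__":
    model = GuardedModel([0.5, 0.25], low=0.0, high=1.0)
    model.weights[0] = flip_bit(model.weights[0], 62)  # corrupt an exponent bit
    print(model.predict([1.0, 1.0]), model.repairs)
```

A check of this kind only catches upsets that push the output outside the allowed range; corruption that leaves the output plausible goes unnoticed, which is one motivation for the richer, learned constraints described above.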
Predictive systems
The ultimate goal of space biological research is to predict the effects of spaceflight at all physiological levels within diverse living systems, then develop the building blocks to support life and bioengineer the foundations for sustained life beyond Earth. Such predictive modelling and bioengineering will only be possible once we are able to model all parts of living systems, introduce perturbations, and measure genetic, cellular and physiological outcomes longitudinally.
Building on automated, robotic and longitudinal data capture capabilities, workshop participants emphasized that space biology research will benefit from the development of digital twins: predictive models of whole organisms. Digital twins integrate multi-scale mechanistic mathematical modelling of an entire complex organism, from genes to cells to tissues to organs158,159,160. There now exist whole-cell computational models of the microbes Mycoplasma genitalium161 and Escherichia coli162 for cellular predictions, and the ongoing Physiome Project develops mathematical models of the human body, from cells to tissues to organs, integrating chemical, metabolic, cellular and anatomical information163,164. Such models could integrate microbial–host cell interactions and environmental coupling data to predict responses to microbial population change or environmental perturbations (which are typical of spaceflight). It will be important to identify an appropriate set of reference organisms (bacteria, eukaryotes, archaea, viruses) as targets for high-fidelity digital twin models, which could be used to predict biological response under diverse extraterrestrial environments. Ultimately, this technology will enable the development of predictive models that can be personalized to individual human astronauts based on unique differences in genetics or physiology.
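A full digital twin is far beyond a short example, but the basic workflow of simulating a mechanistic model, introducing a perturbation and comparing longitudinal outcomes can be shown with a toy case. The sketch below integrates a logistic growth equation for a microbial population with the forward Euler method and halves the carrying capacity partway through the run. The choice of model, the parameter values and the function name are all assumptions made for illustration and do not come from the whole-cell or Physiome models cited above.

```python
def simulate_population(n0, growth_rate, capacity_at, t_end, dt=0.1):
    """Forward-Euler integration of logistic growth, dN/dt = r N (1 - N/K(t)).

    capacity_at is a function of time so that an environmental
    perturbation can change the carrying capacity K during the run.
    Returns the population at every time step, including the start.
    """
    steps = int(round(t_end / dt))
    population = n0
    trajectory = [population]
    for k in range(steps):
        capacity = capacity_at(k * dt)
        population += dt * growth_rate * population * (1.0 - population / capacity)
        trajectory.append(population)
    return trajectory


if __name__ == "__main__":
    # Unperturbed run: the carrying capacity stays fixed.
    baseline = simulate_population(10.0, 0.5, lambda t: 1000.0, 40.0)
    # Perturbed run: the capacity is halved at t = 20 (hypothetical stressor).
    perturbed = simulate_population(
        10.0, 0.5, lambda t: 1000.0 if t < 20.0 else 500.0, 40.0
    )
    print(round(baseline[-1]), round(perturbed[-1]))
```

Comparing the two trajectories step by step gives the predicted longitudinal response to the perturbation, which is the kind of output a higher-fidelity twin would be expected to provide across many coupled variables.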
Discussion
Our current understanding of the multi-tiered effects of spaceflight stressors on a mammalian organism is derived from a small amount of human astronaut data and hundreds of small, expensive, model organism biological experiments performed manually during a variety of spaceflight missions (and ground-analogue experiments). Workshop participants agreed that to advance space biology research as a field, a paradigm shift is necessary from the current manual, single-experiment paradigm into a new era of biological research conducted in space facilitated by robotic automation, AI-driven experimental design and analysis. Workshop participants envisioned an AI and ML space biology research lifecycle, with a data management environment and appropriate AI and ML methods facilitating the acceleration of research findings and ultimately powering widespread flight data acquisition and precision support for astronaut and ecosystem health.
Make AI and ML space-ready
Although much of the automation discussed is already in use terrestrially, the hardware and software involved are not immediately suitable for spaceflight research. Steps must be taken to convert and develop these automated systems for use in-flight, following existing processes for creating space-ready hardware. Many types of scientific equipment have already been cleared for spaceflight and are currently in use on the ISS165, but anticipated future challenges include but are not limited to: (1) known difficulties in microfluidic processing in microgravity83, (2) enabling processes beyond LEO in higher radiation exposure and altered/partial gravity1, and (3) deploying effective edge computing for diverse locations such as the Lunar Gateway, lunar surface, Mars transit and the Martian surface. Spaceflight-ready automated systems will enable cost-effective collection of vast biological data in difficult or constrained conditions. Moreover, workshop participants agreed that the next step is to couple automation with AI-assisted or AI-driven hypothesis generation and experimental design to facilitate the automatic generation of biological insights over time without the need for human input and expertise (self-driving labs). By benefiting from the high reproducibility of machines, we envision a future where automatic data and metadata acquisition will be complete and unambiguous such that AI and ML methods will be able to accumulate such information and constantly learn.
Data standards and fields of research
The future of space biology experimentation envisioned during the workshop will only be possible through widespread adoption and implementation of standards for generating and maintaining data and metadata from automated and AI-driven systems. It will be vitally important to develop a set of guidelines for generating AI-ready, machine-readable data from every space biology experiment, to facilitate open-access AI- and ML-assisted data analysis and reuse. This must include concerted efforts to develop and maintain space biology vocabularies, ontologies and data dictionaries that can be leveraged for automated reasoning, as well as top-down motivation to adopt standardized data management facilities.
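As a minimal sketch of what machine-readable, vocabulary-controlled metadata could look like in practice, the example below defines a sample record, checks its fields against small controlled vocabularies and serializes valid records to JSON. Every field name and vocabulary term here is a hypothetical placeholder; an actual standard would draw its terms from community ontologies and data dictionaries of the kind discussed above.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical controlled vocabularies, for illustration only.
CONTROLLED_TERMS = {
    "organism": {"Mus musculus", "Arabidopsis thaliana", "Escherichia coli"},
    "assay_type": {"RNA-seq", "proteomics", "microscopy"},
    "environment": {"spaceflight", "ground control", "ground analogue"},
}


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical minimal metadata record for one biological sample."""

    sample_id: str
    organism: str
    assay_type: str
    environment: str
    duration_days: float


def validate(record):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field_name, allowed in CONTROLLED_TERMS.items():
        value = getattr(record, field_name)
        if value not in allowed:
            problems.append(f"{field_name}: '{value}' is not a controlled term")
    if not record.sample_id:
        problems.append("sample_id: must not be empty")
    if record.duration_days < 0:
        problems.append("duration_days: must be non-negative")
    return problems


def to_json(record):
    """Serialize a record to JSON, refusing records that fail validation."""
    problems = validate(record)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = SampleRecord("S-001", "Mus musculus", "RNA-seq", "spaceflight", 35.0)
    print(to_json(record))
```

Rejecting free-text values such as 'mouse' at the point of data capture, rather than reconciling them later, is what allows downstream tools to pool records from different experiments without manual curation.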
Adapting and developing AI and ML methods for space biology
Workshop participants discussed existing AI and ML methods, models and algorithms, and agreed that the next decade of investment must include a focus on adaptation and implementation of existing methods with a specific focus on space biology. Approaches such as transfer learning and generative modelling hold great promise for space biology research, but care must be taken to adapt these methods within spaceflight constraints. Further, novel AI and ML approaches with high potential for space biology were discussed, including neuromorphic computing. Due to limited bandwidth in space, our efforts should focus on developing multi-faceted solutions including pre-training lightweight models on larger Earth-bound datasets, federated training166, edge computing167 and onboard processing. A more in-depth discussion of these solutions is in the accompanying workshop review article12.
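The federated training mentioned above can be illustrated with a small sketch of one common scheme, in which each site trains on its own data and a coordinator averages the resulting parameters, weighted by the number of samples at each site. Only model parameters are exchanged, never raw data, which is what makes the approach attractive when bandwidth is limited. The linear model, the two simulated sites, and the learning rate, epoch and round counts are illustrative assumptions.

```python
def local_update(weights, data, learning_rate=0.05, epochs=5):
    """Run a few gradient-descent epochs on one site's private data.

    The model is y = w * x + b with a mean squared error cost.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)


def federated_average(site_weights, site_sizes):
    """Average parameter tuples across sites, weighted by sample count."""
    total = sum(site_sizes)
    return tuple(
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(len(site_weights[0]))
    )


def train_federated(sites, rounds=300):
    """Alternate local training and weighted averaging for several rounds."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each site starts from the shared model and returns only parameters.
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


if __name__ == "__main__":
    # Two simulated sites holding different samples of the line y = 2x + 1.
    site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    site_b = [(3.0, 7.0), (4.0, 9.0)]
    slope, intercept = train_federated([site_a, site_b])
    print(round(slope, 4), round(intercept, 4))
```

In a spaceflight setting the two sites might correspond to a ground laboratory and an onboard platform, with only the small parameter tuple crossing the communication link in each round.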
In the next decade, investment in AI and ML research design and analysis promises to revolutionize the way biological research is performed in space. Integration of automated and self-training systems will enable hands-off and reproducible generation of substantial cutting-edge imaging, video and multi-omics datasets, ready to be mined by next-generation AI and ML space biology tools. Workshop participants agreed that the key to this future involves the creation of multidisciplinary teams with statisticians, biologists, modelling experts and hardware developers. Such cross-cutting and interdisciplinary teams are able to facilitate the experimentation and data analysis necessary to fully understand and begin to predict and mitigate stressors of spaceflight, and enable life to thrive in deep space.
Recommendations and conclusion
The goal of the ‘Workshop on Artificial Intelligence and Modeling for Space Biology’ was to develop a vision for optimal usage of AI and ML in both spaceflight health support and space biological research over the next decade. Workshop participants identified several areas of space biological experimentation that would benefit from AI- and ML-based analysis or automation, and developed a set of fundamental action items for the next ten years. It was challenging to identify precise development strategies, as the current use of AI and ML in space biology is limited. The outcome of the workshop took the form of several broad focus areas rather than a detailed roadmap. These focus areas are as follows.
(1) Ensure all space biological data and information are generated with strong data stewardship standards embracing FAIR and open science to enable public access and scientific reuse.
(2) Develop self-driving labs for spaceflight, using data management standards to inform the data output and organization from automated experimental platforms.
(3) Adapt relevant existing AI, ML and modelling methods best suited for space biology research implementation (and when necessary lead development of new methods).
Adoption of AI, ML and modelling methods in space biological research is an endeavour that will span the next decade, but will ultimately revolutionize the way we perform experiments and analyse data. The developments discussed in this paper will enable us to gather the necessary data and tools to build a comprehensive characterization of the biological responses of living systems to myriad diverse spaceflight environments. This knowledge base will be essential to facilitate NASA’s goals of lunar, Martian and deep space missions, as we will be able to predict and mitigate adverse effects at all biological levels.
References
Afshinnekoo, E. et al. Fundamental biological features of spaceflight: advancing the field to enable deep-space exploration. Cell 183, 1162–1184 (2020).
Loftus, D. J., Rask, J. C., McCrossin, C. G. & Tranfield, E. M. The chemical reactivity of lunar dust: from toxicity to astrobiology. Earth Moon Planets 107, 95–105 (2010).
Pohlen, M., Carroll, D., Prisk, G. K. & Sawyer, A. J. Overview of lunar dust toxicity risk. NPJ Microgravity 8, 55 (2022).
Paul, A.-L. & Ferl, R. J. The biology of low atmospheric pressure–implications for exploration mission design and advanced life support. Am. Soc. Gravit. Space Biol. 19, 3–17 (2005).
National Research Council. Recapturing a Future for Space Exploration: Life and Physical Sciences Research for a New Era (National Academies Press, 2011).
Goswami, N. et al. Maximizing information from space data resources: a case for expanding integration across research disciplines. Eur. J. Appl. Physiol. 113, 1645–1654 (2013).
Nangle, S. N. et al. The case for biotech on Mars. Nat. Biotechnol. 38, 401–407 (2020).
Costes, S. V., Sanders, L. M. & Scott, R. T. Workshop on Artificial Intelligence & Modeling for Space Biology. Zenodo https://doi.org/10.5281/zenodo.7508535 (2023).
Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).
Topol, E. J. Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again (Basic Books, 2019).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Scott, R. T. et al. Biomonitoring and precision health in deep space supported by artificial intelligence. Nat. Mach. Intell. https://doi.org/10.1038/s42256-023-00617-5 (2023).
National Academies of Sciences, Engineering, and Medicine, Policy and Global Affairs, Board on Research Data and Information & Committee on Toward an Open Science Enterprise Open Science by Design: Realizing a Vision for 21st Century Research (National Academies Press, 2018).
Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).
Berrios, D. C., Beheshti, A. & Costes, S. V. FAIRness and usability for open-access omics data systems. AMIA Annu. Symp. Proc. 2018, 232–241 (2018).
Low, L. A. & Giulianotti, M. A. Tissue chips in space: modeling human diseases in microgravity. Pharm. Res. 37, 8 (2019).
Ronca, A. E., Souza, K. A. & Mains, R. C. (eds) Translational Cell and Animal Research in Space: 1965–2011 NASA Special Publication NASA/SP-2015-625 (NASA Ames Research Center, 2016).
Alwood, J. S. et al. From the bench to exploration medicine: NASA life sciences translational research for human exploration and habitation missions. NPJ Microgravity 3, 5 (2017).
Schatten, H., Lewis, M. L. & Chakrabarti, A. Spaceflight and clinorotation cause cytoskeleton and mitochondria changes and increases in apoptosis in cultured cells. Acta Astronaut. 49, 399–418 (2001).
Shi, L. et al. Spaceflight and simulated microgravity suppresses macrophage development via altered RAS/ERK/NFκB and metabolic pathways. Cell. Mol. Immunol. 18, 1489–1502 (2021).
Ferl, R. J., Koh, J., Denison, F. & Paul, A.-L. Spaceflight induces specific alterations in the proteomes of Arabidopsis. Astrobiology 15, 32–56 (2015).
Ou, X. et al. Spaceflight induces both transient and heritable alterations in DNA methylation and gene expression in rice (Oryza sativa L.). Mutat. Res. 662, 44–53 (2009).
Overbey, E. G. et al. Spaceflight influences gene expression, photoreceptor integrity, and oxidative stress-related damage in the murine retina. Sci. Rep. 9, 13304 (2019).
Clément, G. & Slenzka, K. Fundamentals of Space Biology: Research on Cells, Animals, and Plants in Space (Springer Science & Business Media, 2006).
Yeung, C. K. et al. Tissue chips in space-challenges and opportunities. Clin. Transl. Sci. 13, 8–10 (2020).
Low, L. A., Mummery, C., Berridge, B. R., Austin, C. P. & Tagle, D. A. Organs-on-chips: into the next decade. Nat. Rev. Drug Discov. 20, 345–361 (2021).
Globus, R. K. & Morey-Holton, E. Hindlimb unloading: rodent analog for microgravity. J. Appl. Physiol. 120, 1196–1206 (2016).
Simonsen, L. C., Slaba, T. C., Guida, P. & Rusek, A. NASA’s first ground-based Galactic cosmic ray simulator: enabling a new era in space radiobiology research. PLoS Biol. 18, e3000669 (2020).
Buckey, J. C. Jr & Homick, J. L. The Neurolab Spacelab Mission: Neuroscience Research in Space: Results from the STS-90, Neurolab Spacelab Mission. NASA Technical Reports Server (NASA, 2003).
Diallo, O. N. et al. Impact of the International Space Station Research Results. NASA Technical Reports Server (NASA, 2019).
Vandenbrink, J. P. & Kiss, J. Z. Space, the final frontier: a critical review of recent experiments performed in microgravity. Plant Sci. 243, 115–119 (2016).
Massaro Tieze, S., Liddell, L. C., Santa Maria, S. R. & Bhattacharya, S. BioSentinel: a biological CubeSat for deep space exploration. Astrobiology https://doi.org/10.1089/ast.2019.2068 (2020).
Ricco, A. J., Maria, S. R. S., Hanel, R. P. & Bhattacharya, S. BioSentinel: a 6U nanosatellite for deep-space biological science. IEEE Aerospace Electron. Syst. Mag. 35, 6–18 (2020).
Chen, Y. et al. Automated ‘cells-to-peptides’ sample preparation workflow for high-throughput, quantitative proteomic assays of microbes. J. Proteome Res. 18, 3752–3761 (2019).
Zampieri, M., Sekar, K., Zamboni, N. & Sauer, U. Frontiers of high-throughput metabolomics. Curr. Opin. Chem. Biol. 36, 15–23 (2017).
Stephens, Z. D. et al. Big data: astronomical or genomical? PLoS Biol. 13, e1002195 (2015).
Tomczak, K., Czerwińska, P. & Wiznerowicz, M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp. Oncol. 19, A68–A77 (2015).
Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Atta, L. & Fan, J. Computational challenges and opportunities in spatially resolved transcriptomic data analysis. Nat. Commun. 12, 5283 (2021).
Marx, V. Method of the year: spatially resolved transcriptomics. Nat. Methods 18, 9–14 (2021).
Deamer, D., Akeson, M. & Branton, D. Three decades of nanopore sequencing. Nat. Biotechnol. 34, 518–524 (2016).
Mardis, E. R. DNA sequencing technologies: 2006–2016. Nat. Protoc. 12, 213–218 (2017).
Stuart, T. & Satija, R. Integrative single-cell analysis. Nat. Rev. Genet. 20, 257–272 (2019).
Asp, M. et al. A spatiotemporal organ-wide gene expression and cell atlas of the developing human heart. Cell 179, 1647–1660.e19 (2019).
Giacomello, S. et al. Spatially resolved transcriptome profiling in model plant species. Nat Plants 3, 17061 (2017).
Mao, X. W. et al. Characterization of mouse ocular response to a 35-day spaceflight mission: evidence of blood-retinal barrier disruption and ocular adaptations. Sci. Rep. 9, 8215 (2019).
Jonscher, K. R. et al. Spaceflight activates lipotoxic pathways in mouse liver. PLoS ONE 11, e0152877 (2016).
Beheshti, A. et al. Multi-omics analysis of multiple missions to space reveal a theme of lipid dysregulation in mouse liver. Sci. Rep. 9, 19195 (2019).
Malkani, S. et al. Circulating miRNA spaceflight signature reveals targets for countermeasure development. Cell Rep. 33, 108448 (2020).
da Silveira, W. A. et al. Comprehensive multi-omics analysis reveals mitochondrial stress as a central biological hub for spaceflight impact. Cell 183, 1185–1201.e20 (2020).
Jiang, P., Green, S. J., Chlipala, G. E., Turek, F. W. & Vitaterna, M. H. Reproducible changes in the gut microbiome suggest a shift in microbial and host metabolism during spaceflight. Microbiome 7, 113 (2019).
Beisel, N. S., Noble, J., Barbazuk, W. B., Paul, A.-L. & Ferl, R. J. Spaceflight-induced alternative splicing during seedling development in Arabidopsis thaliana. NPJ Microgravity 5, 9 (2019).
Polo, S.-H. L. et al. RNAseq analysis of rodent spaceflight experiments is confounded by sample collection techniques. iScience 23, 101733 (2020).
Choi, S., Ray, H. E., Lai, S.-H., Alwood, J. S. & Globus, R. K. Preservation of multiple mammalian tissues to maximize science return from ground based and spaceflight experiments. PLoS ONE 11, e0167391 (2016).
Krishnamurthy, A., Ferl, R. J. & Paul, A.-L. Comparing RNA-seq and microarray gene expression data in two zones of the Arabidopsis root apex relevant to spaceflight. Appl. Plant Sci. 6, e01197 (2018).
Vrana, J. et al. Aquarium: open-source laboratory software for design, execution and data management. Synth. Biol. 6, ysab006 (2021).
Miles, B. & Lee, P. L. Achieving reproducibility and closed-loop automation in biological experimentation with an IoT-enabled lab of the future. SLAS Technol. 23, 432–439 (2018).
Durand, A. et al. A machine learning approach for online automated optimization of super-resolution optical microscopy. Nat. Commun. 9, 5247 (2018).
Hess, J. F. et al. Library preparation for next generation sequencing: a review of automation strategies. Biotechnol. Adv. 41, 107537 (2020).
Gómez-Sjöberg, R., Leyrat, A. A., Pirone, D. M., Chen, C. S. & Quake, S. R. Versatile, fully automated, microfluidic cell culture system. Anal. Chem. 79, 8557–8563 (2007).
Jessop-Fabre, M. M. & Sonnenschein, N. Improving reproducibility in synthetic biology. Front. Bioeng. Biotechnol. 7, 18 (2019).
Hillson, N. et al. Building a global alliance of biofoundries. Nat. Commun. 10, 2040 (2019).
Arnold, C. Cloud labs: where robots do the research. Nature 606, 612–613 (2022).
Segal, M. An operating system for the biology lab. Nature 573, S112–S113 (2019).
Thiel, C. S. et al. Real-time 3D high-resolution microscopy of human cells on the International Space Station. Int. J. Mol. Sci. 20, 2033 (2019).
Ferl, R. J. & Paul, A.-L. The effect of spaceflight on the gravity-sensing auxin gradient of roots: GFP reporter gene microscopy on orbit. NPJ Microgravity 2, 15023 (2016).
Greenwald, N. F. et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 40, 555–565 (2022).
Ronca, A. E. et al. Behavior of mice aboard the International Space Station. Sci. Rep. 9, 4717 (2019).
Wiltschko, A. B. et al. Mapping sub-second structure in mouse behavior. Neuron 88, 1121–1135 (2015).
Dunn, T. W. et al. Geometric deep learning enables 3D kinematic profiling across species and environments. Nat. Methods 18, 564–573 (2021).
Pereira, T. D. et al. SLEAP: A deep learning system for multi-animal pose tracking. Nat. Methods 19, 486–495 (2022).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Zhang, P. et al. Multi-scale vision longformer: A new vision transformer for high-resolution image encoding. In Proc. IEEE/CVF Intl. Conf. Computer Vision 2998–3008 (2021).
Chen, Z. et al. Visformer: The vision-friendly transformer. In Proc. IEEE/CVF Intl. Conf. Computer Vision 589–598 (2021).
Savoy, M. IDx-DR for Diabetic Retinopathy Screening. American Family Physician https://www.aafp.org/afp/2020/0301/p307.html (2020).
Vyas, R. J. et al. Decreased vascular patterning in the retinas of astronaut crew members as new measure of ocular damage in spaceflight-associated neuro-ocular syndrome. Invest. Ophthalmol. Vis. Sci. 61, 34 (2020).
Lagatuz, M. et al. Vascular patterning as integrative readout of complex molecular and physiological signaling by VESsel GENeration analysis. J. Vasc. Res. 58, 207–230 (2021).
Lee, A. G. et al. Spaceflight associated neuro-ocular syndrome (SANS) and the neuro-ophthalmologic effects of microgravity: a review and an update. NPJ Microgravity 6, 7 (2020).
Chopra, R., Wagner, S. K. & Keane, P. A. Optical coherence tomography in the 2020s-outside the eye clinic. Eye 35, 236–243 (2021).
Sher, I., Moverman, D., Ketter-Katz, H., Moisseiev, E. & Rotenstreich, Y. In vivo retinal imaging in translational regenerative research. Ann. Transl. Med. 8, 1096 (2020).
Mao, X. W. et al. Impact of spaceflight and artificial gravity on the mouse retina: biochemical and proteomic analysis. Int. J. Mol. Sci. 19, 2546 (2018).
Castro-Wallace, S. L. et al. Nanopore DNA sequencing and genome assembly on the International Space Station. Sci. Rep. 7, 18022 (2017).
McIntyre, A. B. R. et al. Nanopore sequencing in microgravity. NPJ Microgravity 2, 16035 (2016).
Stahl-Rommel, S. et al. Real-time culture-independent microbial profiling onboard the International Space Station using nanopore sequencing. Genes 12, 106 (2021).
Häse, F., Roch, L. M. & Aspuru-Guzik, A. Next-generation experimentation with self-driving laboratories. Trends Chem. 1, 282–291 (2019).
Garcia Martin, H. et al. Perspectives for self-driving labs in synthetic biology. Curr. Opin. Biotech. 79, 102881 (2023).
Borkowski, O. et al. Large scale active-learning-guided exploration for in vitro protein production optimization. Nat. Commun. 11, 1872 (2020).
Kitano, H. Nobel Turing Challenge: creating the engine for scientific discovery. NPJ Syst. Biol. Appl. 7, 29 (2021).
Christensen, M. et al. Data-science driven autonomous process optimization. Commun. Chem. 4, 112 (2021).
Granda, J. M., Donina, L., Dragone, V., Long, D.-L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).
MacLeod, B. P. et al. Self-driving laboratory for accelerated discovery of thin-film materials. Sci. Adv. 6, eaaz8867 (2020).
Kusne, A. G. et al. On-the-fly closed-loop materials discovery via Bayesian active learning. Nat. Commun. 11, 5966 (2020).
Carbonell, P., Radivojevic, T. & Martín, H. G. Opportunities at the intersection of synthetic biology, machine learning, and automation. ACS Synth. Biol. 8, 1474–1477 (2019).
Shih, S. C. C. et al. A versatile microfluidic device for automating synthetic biology. ACS Synth. Biol. 4, 1151–1164 (2015).
Shih, S. C. C. et al. A droplet-to-digital (D2D) microfluidic device for single cell assays. Lab Chip 15, 225–236 (2015).
Iwai, K. et al. Scalable and automated CRISPR-based strain engineering using droplet microfluidics. Microsys. Nanoeng. 8, 31 (2022).
Markin, C. J. et al. Revealing enzyme functional architecture via high-throughput microfluidic enzyme kinetics. Science 373, eabf8761 (2021).
Crane, M. M., Chung, K., Stirman, J. & Lu, H. Microfluidics-enabled phenotyping, imaging, and screening of multicellular organisms. Lab Chip 10, 1509–1517 (2010).
Nakai, M. & Ke, W. Review of the methods for handling missing data in longitudinal data analysis. Int. J. Math. Analysis 5, 1–13 (2011).
Ray, S. et al. GeneLab: omics database for spaceflight experiments. Bioinformatics 35, 1753–1759 (2019).
Berrios, D. C., Galazka, J., Grigorev, K., Gebre, S. & Costes, S. V. NASA GeneLab: interfaces for the exploration of space omics data. Nucleic Acids Res. 49, D1515–D1522 (2021).
Scott, R. T. et al. Advancing the integration of biosciences data sharing to further enable space exploration. Cell Rep. 33, 108441 (2020).
Sanders, L. M. & Costes, S. V. NASA Science Mission Directorate Artificial Intelligence Workshop Report: Standards for AI readiness. National Aeronautics and Space Administration 22–29 (NASA, 2021).
Haibe-Kains, B. et al. Transparency and reproducibility in artificial intelligence. Nature 586, E14–E16 (2020).
Gebru, T. et al. Datasheets for datasets. Comms. ACM 64, 86–92 (2021).
Ong, E. et al. Ontobee: a linked ontology data server to support ontology term dereferencing, linkage, query and integration. Nucleic Acids Res. 45, D347–D352 (2017).
Noy, N. F. et al. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res. 37, W170–W173 (2009).
Köhler, S. et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res. 42, D966–D974 (2014).
Radiation biology ontology. BioPortal https://bioportal.bioontology.org/ontologies/RBO (2022).
Kwok, R. How to pick an electronic laboratory notebook. Nature 560, 269–270 (2018).
Kanza, S. et al. Electronic lab notebooks: can they replace paper? J. Cheminform. 9, 31 (2017).
Erard, S. et al. VESPA: a community-driven Virtual Observatory in Planetary Science. Planetary and Space Science 150, 65–85 (2018).
Zaslavsky, I. et al. EarthCube Data Discovery Hub: enhancing, curating and finding data across multiple geoscience data sources. American Geophysical Union, Fall Meeting 2017 Abstract IN21B-0049 (American Geophysical Union, 2017).
Crichton, D. J. et al. Cancer biomarkers and big data: a planetary science approach. Cancer Cell 38, 757–760 (2020).
Greene, G., Plante, R. & Hanisch, R. Building open access to research (OAR) data infrastructure at NIST. Data Sci. J. 18, 10.5334/dsj-2019-030 (2019).
McGregor, C. A platform for real-time space health analytics as a service utilizing space data relays. In 2021 IEEE Aerospace Conference (50100) 1–14 (IEEE, 2021).
McGregor, C. A platform for real-time online health analytics during spaceflight. In 2013 IEEE Aerospace Conference 1–8 (IEEE, 2013).
Lavin, A. et al. Technology readiness levels for machine learning systems. Nat. Commun. 13, 6039 (2022).
Heil, B. J. et al. Reproducibility standards for machine learning in the life sciences. Nat. Methods 18, 1132–1135 (2021).
Mohamed, S. K., Nounu, A. & Nováček, V. Biological applications of knowledge graph embedding models. Brief. Bioinform. 22, 1679–1693 (2021).
Ehrlinger, L. & Wöß, W. Towards a definition of knowledge graphs. SEMANTICS 2016 Posters and Demos Track 1–4 (SEMANTICS, 2016).
Nelson, C. A. et al. Knowledge network embedding of transcriptomic data from spaceflown mice uncovers signs and symptoms associated with terrestrial diseases. Life 11, 42 (2021).
Nelson, C. A., Butte, A. J. & Baranzini, S. E. Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings. Nat. Commun. 10, 3045 (2019).
Nelson, C. A., Bove, R., Butte, A. J. & Baranzini, S. E. Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis. J. Am. Med. Inform. Assoc. 29, 424–434 (2021).
Antonsen, E. L. et al. Directed acyclic graph guidance documentation. NASA Technical Reports Server (NASA, 2022).
Reynolds, R. J. et al. Validating causal diagrams of human health risks for spaceflight: an example using bone data from rodents. Biomedicines 10, 2187 (2022).
Pawar, U., O’Shea, D., Rea, S. & O’Reilly, R. Incorporating explainable artificial intelligence (XAI) to aid the understanding of machine learning in the healthcare domain. In Proc. Artif. Intell. Cogn. Sci. 169–180 (2020).
Adadi, A. & Berrada, M. in Embedded Systems and Artificial Intelligence (ed. Ditzinger, T.) 327–337 (Springer Singapore, 2020).
Yang, G., Ye, Q. & Xia, J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: a mini-review, two showcases and beyond. Inf. Fusion 77, 29–52 (2022).
Rajabi, E. & Etminani, K. Towards a knowledge graph-based explainable decision support system in healthcare. Stud. Health Technol. Inform. 281, 502–503 (2021).
Covert, I., Lundberg, S. & Lee, S.-I. Explaining by removing: a unified framework for model explanation. J. Mach. Learn. Res. 22, 1–90 (2021).
Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why should I trust you?’: Explaining the predictions of any classifier. Preprint at https://arxiv.org/abs/1602.04938 (2016).
Lundberg, S. & Lee, S.-I. A unified approach to interpreting model predictions. Preprint at https://arxiv.org/abs/1705.07874 (2017).
Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at https://arxiv.org/abs/1312.6114 (2013).
Antoniadou, E. et al. NASA frontier development lab technical memorandum: harnessing AI to support medical care in space. Frontier Development Lab (Frontier Development Lab, 2019).
Gupta, A. & Zou, J. Feedback GAN for DNA optimizes protein functions. Nat. Mach. Intell. 1, 105–111 (2019).
Viñas, R., Andrés-Terré, H., Liò, P. & Bryson, K. Adversarial generation of gene expression data. Bioinformatics 38, 730–737 (2021).
Ghimire, S. et al. Generative modeling and inverse imaging of cardiac transmembrane potential. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2018 508–516 (Springer, 2018).
Shakeri, F. et al. FHIST: a benchmark for few-shot classification of histological images. Preprint at https://arxiv.org/abs/2206.00092 (2022).
Yang, J., Chen, H., Yan, J., Chen, X. & Yao, J. Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning. Preprint at https://arxiv.org/abs/2202.09059 (2022).
Ravishankar, H. et al. in Deep Learning and Data Labeling for Medical Applications (eds. Carneiro, G. et al.) 188–196 (Springer, 2016).
Altaf, F., Islam, S. M. S. & Janjua, N. K. A novel augmented deep transfer learning for classification of COVID-19 and other thoracic diseases from X-rays. Neural Comput. Appl. 33, 14037–14048 (2021).
Bersuker, G., Mason, M. & Jones, K. L. Neuromorphic computing: the potential for high-performance processing in space. Center for Space Policy and Strategy https://csps.aerospace.org/sites/default/files/2021-08/Bersuker_NeuromorphicComputing_12132018.pdf (The Aerospace Corporation, 2018).
Davies, M. et al. Loihi: a neuromorphic manycore processor with on-chip learning. IEEE Micro 38, 82–99 (2018).
Göltz, J. et al. Fast and energy-efficient neuromorphic deep learning with first-spike times. Nat. Mach. Intell. 3, 823–835 (2021).
Dahl, S. G., Ivans, R. C. & Cantley, K. D. Learning behavior of memristor-based neuromorphic circuits in the presence of radiation. Proc. Intl. Conf. Neuromorphic Syst. 53–56 (ACM, 2019).
Yanguas-Gil, A. et al. Neuromorphic architectures for edge computing under extreme environments. 2021 IEEE Space Computing Conference (SCC) 39–45 (2021).
Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).
Furano, G. et al. Towards the use of artificial intelligence on the edge in space systems: challenges and opportunities. IEEE Aerospace Electron Syst. Mag. 35, 44–56 (2020).
Caro, M. C. et al. Generalization in quantum machine learning from few training data. Nat. Commun. 13, 4919 (2022).
Huang, H.-Y. et al. Power of data in quantum machine learning. Nat. Commun. 12, 2631 (2021).
Farhi, E. & Neven, H. Classification with quantum neural networks on near term processors. Preprint at https://arxiv.org/abs/1802.06002 (2018).
Quenelle, N. NASA TechLeap Prize winner tests quantum earth observation system. NASA Feature https://www.nasa.gov/feature/nasa-techleap-prize-winner-tests-quantum-earth-observation-system (NASA, 2022).
Hammarlund, E. U. Harnessing hypoxia as an evolutionary driver of complex multicellularity. Interface Focus 10, 20190101 (2020).
Liu, J., Harkin, J., Maguire, L. P., McDaid, L. J. & Wade, J. J. SPANNER: a self-repairing spiking neural network hardware architecture. IEEE Trans. Neural Netw. Learn. Syst. 29, 1287–1300 (2018).
Leino, K. et al. Self-correcting neural networks for safe classification. Preprint at https://arxiv.org/abs/2107.11445 (2021).
Ruiz, C., Zitnik, M. & Leskovec, J. Identification of disease treatment mechanisms through the multiscale interactome. Nat. Commun. 12, 1796 (2021).
Schaffer, L. V. & Ideker, T. Mapping the multiscale structure of biological systems. Cell Syst. 12, 622–635 (2021).
Yu, M. K. et al. Translation of genotype to phenotype by a hierarchy of cell subsystems. Cell Syst. 2, 77–88 (2016).
Karr, J. R. et al. A whole-cell computational model predicts phenotype from genotype. Cell 150, 389–401 (2012).
Macklin, D. N. et al. Simultaneous cross-evaluation of heterogeneous E. coli datasets via mechanistic simulation. Science 369, eaav3751 (2020).
Hunter, P. J. & Borg, T. K. Integration from proteins to organs: the Physiome Project. Nat. Rev. Mol. Cell Biol. 4, 237–243 (2003).
Fink, M. et al. Cardiac cell modelling: observations from the heart of the cardiac Physiome Project. Prog. Biophys. Mol. Biol. 104, 2–21 (2011).
Space Station Research Explorer. NASA https://www.nasa.gov/mission_pages/station/research/experiments/explorer/ (accessed 1 October 2022).
Li, T., Sahu, A. K., Talwalkar, A. & Smith, V. Federated learning: challenges, methods, and future directions. IEEE Signal Process. Mag. 37, 50–60 (2020).
Shi, W. & Dustdar, S. The promise of edge computing. Computer 49, 78–81 (2016).
Acknowledgements
We thank all June 2021 participants and speakers at the ‘NASA Workshop on Artificial Intelligence and Modeling for Space Biology’. We thank the NASA Space Biology Program, part of the NASA Biological and Physical Sciences Division within the NASA Science Mission Directorate, as well as the NASA Human Research Program (HRP). We thank the Space Biosciences Division and Space Biology at Ames Research Center (ARC), especially D. Ly, R. Vik and P. Vaishampayan. We are grateful for the support provided by NASA GeneLab and the NASA Ames Life Sciences Data Archive. Additional thanks to S. Bhattacharya, NASA Space Biology Program Scientist; K. Martin, ARC Lead of Exploration Medical Capability (an Element of HRP); and L. Lewis, ARC NASA HRP Lead. Funding: S.V.C. is funded by NASA Human Research Program grant NNJ16HP24I. S.E.B. holds the Heidrich Family and Friends endowed Chair in Neurology at the University of California, San Francisco (UCSF). S.E.B. also holds the Distinguished Professorship I in Neurology at UCSF. S.E.B. is funded by a National Science Foundation Convergence Accelerator award (2033569) and an NIH/NCATS Translator award (1OT2TR003450). G.I.M. was supported by the Translational Research Institute for Space Health, through NASA NNX16AO69A (Project Number T0412). E.L.A. was supported by the Translational Research Institute for Space Health, through NASA NNX16AO69A. C.E.M. thanks NASA grants NNX14AH50G and NNX17AB26G. This work was also part of the DOE Agile BioFoundry, supported by the US Department of Energy, Energy Efficiency and Renewable Energy, Bioenergy Technologies Office, and the DOE Joint BioEnergy Institute, supported by the Office of Science, Office of Biological and Environmental Research, through contract DE-AC02-05CH11231 between Lawrence Berkeley National Laboratory and the US Department of Energy. S.V.K. is funded by the Canadian Space Agency (19HLSRM04) and Natural Sciences and Engineering Research Council (NSERC, RGPIN-288253). J.H.Y. is funded by NIH grant R00 GM118907 and the Agilent Early Career Professor Award.
Author information
Contributions
All authors contributed ideas and discussion during the joint workshop writing session or were speakers at the ‘NASA Workshop on Artificial Intelligence and Modeling for Space Biology’. L.M.S., R.T.S. and S.V.C. prepared the manuscript. All authors provided feedback on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Ilaria Cinelli, Christopher Bradburne and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Sanders, L.M., Scott, R.T., Yang, J.H. et al. Biological research and self-driving labs in deep space supported by artificial intelligence. Nat Mach Intell 5, 208–219 (2023). https://doi.org/10.1038/s42256-023-00618-4