The problem

During early embryonic development, evolutionarily conserved signaling pathways work together to organize the body plan by controlling differentiation, tissue growth and cell migration [1]. Each signaling pathway activates a different set of genes necessary for body patterning, and disruption of signaling activity results in characteristic phenotypes with specific patterning and tissue defects. Reactivation or perturbation of the signaling pathways in adult tissues can cause the formation of tumors with embryo-like properties, abnormal cell migration and proliferation. However, because the phenotypes that result from perturbing different signaling pathways overlap, they are easily confused, even by experienced developmental biologists. Therefore, we looked for an automated approach capable of identifying complex phenotypes and linking them back to the appropriate signaling pathway, with the prospect of using this tool and the rich embryonic phenotypes for the discovery of novel drugs that modulate signaling pathways.

The solution

We reasoned that recent major advances in AI-based image analysis could be harnessed to achieve our goal [2]. To train our AI algorithm, a deep convolutional neural network named EmbryoNet, we first recorded high-quality movies of zebrafish embryos, which are ideal because they are transparent, can be obtained in large numbers and develop from a single cell to the full body plan within one day. We combined the imaging with a chemical genetics approach to perturb the seven major developmental signaling pathways (Fig. 1a): bone morphogenetic protein (BMP), retinoic acid (RA), Wnt, fibroblast growth factor (FGF), Nodal, sonic hedgehog (Shh) and planar cell polarity (PCP) signaling. The resulting dataset, comprising around 2 million images, was then used to train EmbryoNet, which could robustly identify which signaling pathway had been perturbed in each embryo (Fig. 1b). Interestingly, we found that EmbryoNet could identify these phenotypes at a much earlier embryonic stage than human evaluators. We also successfully trained EmbryoNet to recognize signaling defects in the evolutionarily distant species medaka (Oryzias latipes) and three-spined stickleback (Gasterosteus aculeatus).

Fig. 1: EmbryoNet robustly identifies signaling defects in zebrafish embryos.

a, Simplified schematic of an early zebrafish embryo showing the tissue regions where the major signaling pathways are active. b, Labeled zebrafish embryos with specific defects in the Nodal and BMP signaling pathways, caused by the inhibitors Lefty1 (purple) and Chordin (green), respectively. Unlabeled embryos are those left unperturbed (wild type). At early stages (sphere, left), embryos cannot be distinguished by their signaling defect and are classified as unknown (black squares). At late segmentation stages, 24 hours post-fertilization (h.p.f.; right), EmbryoNet robustly identifies the pathway-specific signaling defects in each embryo, shown by colored squares: green, reduced Nodal signaling; red, reduced BMP signaling; white, normal; magenta, dead. © 2023, Čapek, D. et al., CC BY 4.0.

We then tested EmbryoNet in a large-scale drug screen using a library of compounds that included pharmaceuticals prescribed in the clinic. EmbryoNet robustly identified cytotoxic drugs used in cancer therapy, such as vinblastine and bortezomib, which induced the deaths of zebrafish embryos. Importantly, EmbryoNet also identified previously unknown associations between prescription drugs and signaling defects in zebrafish. One prominent finding was that statins — widely prescribed drugs for lowering cholesterol levels in humans — caused a reduction in the levels of the FGF signal transducer phosphorylated Erk (pErk) in zebrafish embryos. This proof-of-concept experiment demonstrates that EmbryoNet can be used in drug screens to uncover links between small-molecule drugs and signaling pathways relevant for human health.

Future directions

We hope that EmbryoNet will be widely used. Its modular, open-source nature enables EmbryoNet to be easily adapted to a variety of purposes where automated phenotyping will expedite biological and pharmaceutical discovery, including use with embryos of other species and organoids. Furthermore, we provide a large database of annotated images that can be used in conjunction with — or independently of — EmbryoNet to advance AI-based models of animal development.

EmbryoNet identifies phenotypes at a much earlier developmental stage than human experts can, and we are now trying to elucidate the underlying basis. However, it is unclear whether EmbryoNet is also better than humans at recognizing very mild phenotypes, for example those induced by low drug concentrations. Furthermore, EmbryoNet currently relies on a library of manual annotations and cannot classify novel phenotypes such as those caused by combinatorial disruption of signaling pathways.

AI algorithms are already capable of a variety of tasks, such as the generation of contextualized artwork and text and the prediction of three-dimensional protein folding [3]. Rapid developments in AI algorithms in the future might help to address some of the current limitations of EmbryoNet. Building on these advances in AI will help us tackle challenging problems in developmental biology [2] and might eventually enable us to understand how complex animal body plans emerge from the information encoded in animals’ genomes.

Patrick Müller

University of Konstanz, Konstanz, Germany.

Expert opinion

“The approach of using AI for linking images of embryos to known developmental signaling pathways is novel, and the results are impressive. EmbryoNet is a valuable addition to the tools that are available for phenotyping mutants, analyzing embryotoxic compounds, and increasing speed and throughput for these types of experiment.” Marc Muller, Université de Liège, Liège, Belgium.

Behind the paper

We started this project in February 2020, coinciding with the start of the COVID-19 pandemic. Although Germany’s first lockdown restricted access to our institutes, much of the embryo imaging could be done in an automated fashion using a high-throughput microscope — we simply had to briefly come to the laboratory to start each new batch of recordings. Similarly, the manual image annotations used to train EmbryoNet could be largely done working from home. Despite the disruption of the pandemic, we were able to image thousands of embryos under different treatment conditions, yielding the millions of images that formed the knowledge base for EmbryoNet.

The systematic evaluation of EmbryoNet’s performance compared to a large group of human evaluators was possible only after COVID-19-related restrictions had been lifted. We were fortunate to work with more than 100 undergraduate students who were happy to contribute their evaluations to our ongoing research project. P.M.

From the editor

“Deep learning is making waves in bioimage analysis, but applications in developmental biology are relatively rare. Reading this paper, I was stunned by the data quality, the comprehensiveness of the data as a resource, and how much could be learned about pathway perturbations from phenotypes largely invisible to the human eye. I think both the shared data and the approach will unlock insight into zebrafish development.” Rita Strack, Senior Editor, Nature Methods.