AI identifies developmental defects and drug mechanisms in embryos

We developed EmbryoNet, a deep learning tool that can automatically identify and classify developmental defects caused by perturbations of signaling pathways in vertebrate embryos. The tool could help to elucidate the mechanisms of action of pharmaceuticals, potentially transforming the drug discovery process.


The problem
During early embryonic development, evolutionarily conserved signaling pathways work together to organize the body plan by controlling differentiation, tissue growth and cell migration 1. Each signaling pathway activates a different set of genes necessary for body patterning, and disruption of signaling activity results in characteristic phenotypes with specific patterning and tissue defects. Reactivation or perturbation of the signaling pathways in adult tissues can cause the formation of tumors with embryo-like properties, abnormal cell migration and proliferation. However, because there is overlap between the phenotypes that result from the perturbation of different signaling pathways, they can be easily confused, even by experienced developmental biologists. Therefore, we looked for an automated approach capable of identifying complex phenotypes and linking them back to the appropriate signaling pathway, with the prospect of using this tool and the rich embryonic phenotypes for the discovery of novel drugs that modulate signaling pathways.

The solution
We reasoned that recent major advances in AI-based image analysis could be harnessed to achieve our goal 2. To train our AI algorithm, a deep convolutional neural network named EmbryoNet, we first recorded high-quality movies of zebrafish embryos, which are ideal because they are transparent, can be obtained in large numbers and develop from a single cell to the full body plan within one day. We combined the imaging with a chemical genetics approach to perturb the seven major developmental signaling pathways (Fig. 1a): bone morphogenetic protein (BMP), retinoic acid (RA), Wnt, fibroblast growth factor (FGF), Nodal, sonic hedgehog (Shh) and planar cell polarity (PCP) signaling. The resulting dataset, comprising around 2 million images, was then used to train EmbryoNet, which could robustly identify which signaling pathway had been perturbed in each embryo (Fig. 1b). Interestingly, we found that EmbryoNet could identify these phenotypes at a much earlier embryonic stage than human evaluators. We also successfully trained EmbryoNet to recognize signaling defects in the evolutionarily distant species medaka (Oryzias latipes) and three-spined stickleback (Gasterosteus aculeatus).
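To make the classification step concrete, the following Python sketch shows how per-frame class scores from an image classifier could be turned into a pathway label, and how the earliest confident call in a time-lapse could be found. It is an illustration under stated assumptions, not EmbryoNet's actual code: the function names, the extra "normal" class, the 0.9 confidence threshold and the example scores are all hypothetical, and only the seven pathway names come from the text above.

```python
import math

# The seven pathway names come from the text; the "normal" class is an
# assumption added for illustration.
CLASSES = ["normal", "BMP", "RA", "Wnt", "FGF", "Nodal", "Shh", "PCP"]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_frame(scores):
    """Return (label, probability) for the most likely class of one frame."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


def earliest_confident_call(frames, threshold=0.9):
    """Find the first time point with a confident non-normal prediction.

    frames: list of (hours_post_fertilization, scores) pairs (hypothetical
    input format). Returns (hours, label), or None if no frame qualifies.
    """
    for hpf, scores in sorted(frames, key=lambda frame: frame[0]):
        label, prob = classify_frame(scores)
        if label != "normal" and prob >= threshold:
            return hpf, label
    return None


# Made-up scores for one embryo imaged at three time points.
example_frames = [
    (4.0, [1.0, 0.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0]),  # weak Nodal signal
    (6.0, [0.0, 0.0, 0.0, 0.0, 0.0, 8.0, 0.0, 0.0]),  # strong Nodal signal
    (8.0, [0.0, 0.0, 0.0, 0.0, 0.0, 9.0, 0.0, 0.0]),
]
print(earliest_confident_call(example_frames))
```

In this sketch the frame at 4 hours is skipped because its top probability falls below the threshold, which is one simple way to express the idea of calling a phenotype as early as the classifier is confident.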
We then tested EmbryoNet in a large-scale drug screen using a library of compounds that included pharmaceuticals prescribed in the clinic. EmbryoNet robustly identified cytotoxic drugs used in cancer therapy, such as vinblastine and bortezomib, which induced the deaths of zebrafish embryos. Importantly, EmbryoNet also identified previously unknown associations between prescription drugs and signaling defects in zebrafish. One prominent finding was that statins, widely prescribed drugs for lowering cholesterol levels in humans, caused a reduction in the levels of the FGF signal transducer phosphorylated Erk (pErk) in zebrafish embryos. This proof-of-concept experiment demonstrates that EmbryoNet can be used in drug screens to uncover links between small-molecule drugs and signaling pathways relevant for human health.

Future directions
We hope that EmbryoNet will be widely used. Its modular, open-source nature enables EmbryoNet to be easily adapted to a variety of purposes where automated phenotyping will expedite biological and pharmaceutical discovery, including use with embryos of other species and organoids. Furthermore, we provide a large database of annotated images that can be used in conjunction with, or independently of, EmbryoNet to advance AI-based models of animal development.
EmbryoNet identifies phenotypes at a much earlier developmental stage than human experts can, and we are now trying to elucidate the underlying basis. However, it is unclear whether EmbryoNet is also better than humans at recognizing very mild phenotypes, for example those induced by low drug concentrations. Furthermore, EmbryoNet currently relies on a library of manual annotations and cannot classify novel phenotypes such as those caused by combinatorial disruption of signaling pathways.
AI algorithms are already capable of a variety of tasks, such as the generation of contextualized artwork and text and the prediction of three-dimensional protein folding 3. Rapid developments in AI algorithms in the future might help to address some of the current limitations of EmbryoNet. Building on these advances in AI will help us tackle challenging problems in developmental biology 2 and might eventually enable us to understand how complex animal body plans emerge from the information encoded in animals' genomes.

Behind the paper
We started this project in February 2020, coinciding with the start of the COVID-19 pandemic. Although Germany's first lockdown restricted access to our institutes, much of the embryo imaging could be done in an automated fashion using a high-throughput microscope; we simply had to come to the laboratory briefly to start each new batch of recordings. Similarly, the manual image annotations used to train EmbryoNet could largely be done while working from home. Despite the disruption of the pandemic, we were able to image thousands of embryos under different treatment conditions, yielding the millions of images that formed the knowledge base for EmbryoNet. The systematic evaluation of EmbryoNet's performance compared to a large group of human evaluators was possible only after COVID-19-related restrictions had been lifted. We were fortunate to work with more than 100 undergraduate students who were happy to contribute their evaluations to our ongoing research project. P.M.

From the editor
"Deep learning is making waves in bioimage analysis, but applications in developmental biology are relatively rare. Reading this paper, I was stunned by the data quality, the comprehensiveness of the data as a resource, and how much could be learned about pathway perturbations from phenotypes largely invisible to the human eye. I think both the shared data and the approach will unlock insight into zebrafish development." Rita Strack, Senior Editor, Nature Methods.