BacterAI maps microbial metabolism without prior knowledge

Abstract

Training artificial intelligence (AI) systems to perform autonomous experiments would vastly increase the throughput of microbiology; however, few microbes have large enough datasets for training such a system. In the present study, we introduce BacterAI, an automated science platform that maps microbial metabolism but requires no prior knowledge. BacterAI learns by converting scientific questions into simple games that it plays with laboratory robots. The agent then distils its findings into logical rules that can be interpreted by human scientists. We use BacterAI to learn the amino acid requirements for two oral streptococci: Streptococcus gordonii and Streptococcus sanguinis. We then show how transfer learning can accelerate BacterAI when investigating new environments or larger media with up to 39 ingredients. Scientific gameplay and BacterAI enable the unbiased, autonomous study of organisms for which no training data exist.

Fig. 1: BacterAI uses an automated experiment pipeline and an AI agent to study bacterial metabolism.
Fig. 2: BacterAI selects experiments to train an internal neural network that predicts the fitness of S. gordonii.
Fig. 3: Transfer learning accelerates BacterAI.
Fig. 4: BacterAI quickly learns to predict growth for S. sanguinis in CDM with 39 ingredients.

Data availability

All data are available at http://github.com/jensenlab/BacterAI and the authors’ website: http://jensenlab.net/tools.

Code availability

All code is available at http://github.com/jensenlab/BacterAI and the authors’ website: http://jensenlab.net/tools.

Acknowledgements

This research was supported by the National Institutes of Health (grant nos. EB027396 and GM138210 to P.J.). The Titan V used for this research was donated by the NVIDIA Corporation. We thank K. Janes for his comments on the manuscript. Figures 1 and 3 were created with BioRender.com.

Author information

Contributions

A.D. and P.J. conceived the study, implemented BacterAI, designed experiments, analysed data and wrote the manuscript. A.D., K.K., D.L., A.L., N.S. and K.J. optimized laboratory workflows, executed experiments and performed quality control on the data.

Corresponding author

Correspondence to Paul A. Jensen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Microbiology thanks Nathan Price and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 BacterAI learns the amino acid requirements of S. gordonii.

BacterAI learns to predict the growth of S. gordonii in media containing combinations of amino acids. Over 13 days, the agent trains a neural network to guide the search for new experiments along the growth front. Each day, the neural network is retrained using the data from all previous days (train set). The accuracy of the model is measured using the 336 experiments selected each day (test set).
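
The train/test scheme described in this caption can be sketched in Python as a rolling evaluation: each day's batch is scored by a model fitted on earlier days only, and is then added to the training pool. This is a minimal illustration, not the authors' implementation. The function names, the nearest-neighbour stand-in for the neural network, the toy data and the use of a 0.25 grow/no-grow cut-off to score accuracy are all assumptions made for the example.

```python
def fit_nearest_neighbour(train):
    """Hypothetical stand-in learner: predict the fitness of the closest
    previously tested medium (Hamming distance over ingredient indicators)."""
    def predict(media):
        nearest = min(
            train,
            key=lambda pair: sum(a != b for a, b in zip(pair[0], media)),
        )
        return nearest[1]
    return predict


def rolling_accuracy(daily_batches, fit, threshold=0.25):
    """Score each day's batch with a model fitted on all earlier days.

    daily_batches: one list per day of (media, fitness) pairs.
    fit: callable that takes training pairs and returns predict(media).
    threshold: assumed grow/no-grow cut-off used to score a prediction.
    """
    train, accuracies = [], []
    for batch in daily_batches:
        if train and batch:
            predict = fit(train)
            hits = sum(
                (predict(media) >= threshold) == (fitness >= threshold)
                for media, fitness in batch
            )
            accuracies.append(hits / len(batch))
        # Only after scoring does the new batch join the training pool.
        train.extend(batch)
    return accuracies


# Toy data (made up): growth occurs only when the first ingredient is present.
days = [
    [((1, 1, 1), 1.0), ((0, 1, 1), 0.0)],
    [((1, 0, 1), 1.0), ((0, 0, 1), 0.0)],
    [((1, 0, 0), 1.0), ((0, 1, 0), 0.0)],
]
accuracies = rolling_accuracy(days, fit_nearest_neighbour)
print(accuracies)
```

The first day yields no score because there are no earlier data to train on, so the returned list has one entry fewer than the number of days.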

Extended Data Fig. 2 Media selected by BacterAI are not random.

The fitness of S. gordonii grown in randomly selected media clusters near no growth (0) and full growth (1). By contrast, experiments selected by BacterAI are more uniformly distributed.

Extended Data Fig. 3 BacterAI learns the amino acid requirements of S. sanguinis.

BacterAI selects experiments to learn the amino acid requirements of the bacterium Streptococcus sanguinis. Although S. sanguinis is genetically similar to S. gordonii, the bacteria have different amino acid auxotrophies. BacterAI began its investigation of S. sanguinis from a blank slate and did not carry over any knowledge from the previous experiments with S. gordonii.

Extended Data Fig. 4 BacterAI learns amino acid requirements of S. sanguinis using transfer learning.

BacterAI learns a growth model for S. sanguinis using transfer learning. A growth model for S. gordonii was used to select the initial experiments and was retrained with new growth data from S. sanguinis.

Extended Data Fig. 5 BacterAI learns amino acid requirements of S. sanguinis in anaerobic conditions using transfer learning.

BacterAI learns an anaerobic growth model for S. sanguinis using transfer learning. A growth model for S. sanguinis in an aerobic (5% CO2) environment was used to select the initial experiments and was retrained with new growth data from anaerobic experiments.

Extended Data Fig. 6 Replicate growth assays show little variation.

Replicates from fourteen days of experiments with S. sanguinis show little variation in growth. Using a grow/no grow threshold of 0.25 gives 97.37% (1 vs. 2), 97.11% (1 vs. 3), and 97.39% (2 vs. 3) agreement between the replicates.
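
The pairwise agreement figures above can be reproduced in principle with a calculation like the following Python sketch. It assumes that each replicate is a list of fitness values in the same well order and that a well counts as growth when its fitness is at or above the threshold; the caption does not say whether the comparison is inclusive. The function names and the fitness values are made up for illustration.

```python
from itertools import combinations


def agreement(a, b, threshold=0.25):
    """Fraction of paired wells that receive the same grow/no-grow call."""
    if len(a) != len(b) or not a:
        raise ValueError("replicates must be non-empty and of equal length")
    same = sum((x >= threshold) == (y >= threshold) for x, y in zip(a, b))
    return same / len(a)


def pairwise_agreement(replicates, threshold=0.25):
    """Agreement for every pair of replicates, keyed by 1-based indices."""
    return {
        (i + 1, j + 1): agreement(replicates[i], replicates[j], threshold)
        for i, j in combinations(range(len(replicates)), 2)
    }


# Made-up fitness values for three replicates of four media.
replicates = [
    [0.02, 0.90, 0.30, 0.10],
    [0.05, 0.85, 0.20, 0.12],
    [0.01, 0.95, 0.28, 0.40],
]
result = pairwise_agreement(replicates)
for pair, value in result.items():
    print(pair, f"{value:.2%}")
```

Disagreements arise only in wells whose fitness lies near the threshold, which is why replicates with clearly separated growth and no-growth values agree almost everywhere.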

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dama, A.C., Kim, K.S., Leyva, D.M. et al. BacterAI maps microbial metabolism without prior knowledge. Nat Microbiol 8, 1018–1025 (2023). https://doi.org/10.1038/s41564-023-01376-0

  • DOI: https://doi.org/10.1038/s41564-023-01376-0
