  • Review Article

Machine learning for antimicrobial peptide identification and design

Abstract

Artificial intelligence (AI) and machine learning (ML) models are being deployed in many domains of society and have recently reached the field of drug discovery. Given the increasing prevalence of antimicrobial resistance, as well as the challenges intrinsic to antibiotic development, there is an urgent need to accelerate the design of new antimicrobial therapies. Antimicrobial peptides (AMPs) are therapeutic agents for treating bacterial infections, but their translation into the clinic has been slow owing to toxicity, poor stability, limited cellular penetration and high cost, among other issues. Recent advances in AI and ML have led to breakthroughs in our ability to predict biomolecular properties and structures and to generate new molecules. ML-based modelling of peptides may overcome some of the disadvantages associated with traditional drug discovery and aid the rapid development and translation of AMPs. Here, we provide an introduction to this emerging field and survey ML approaches that can be used to address issues currently hindering AMP development. We also outline important limitations that need to be addressed to enable broader adoption of AMPs in clinical practice, as well as new opportunities in data-driven peptide design.

Key points

  • Machine learning (ML) can aid antimicrobial peptide (AMP) design and discovery. It can be applied to improve drug efficacy, predict medicinal chemistry properties and reduce the overall time and cost of drug development.

  • ML can be used to predict therapeutic properties (such as antimicrobial efficacy, and absorption, distribution, metabolism, excretion and toxicity (ADMET)) as well as macromolecular structures.

  • Deep generative models are promising approaches to designing new AMPs.

  • Important limitations in AMP development include lack of selectivity, undesirable physicochemical and medicinal chemistry properties, nonspecific or unknown mechanisms of action, high cost of peptide synthesis, and generation of industrial waste. ML can help to overcome these limitations when relevant models are trained on high-quality datasets.
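As a rough illustration of how a peptide sequence might be turned into inputs for the property-prediction models mentioned in the key points, the following minimal sketch computes a one-hot encoding and two simple physicochemical descriptors. It is not the method of any study cited here. The function names (`one_hot`, `net_charge`, `mean_hydrophobicity`, `describe`) and the example sequence are hypothetical, the hydrophobicity values follow the Kyte-Doolittle scale, and the charge estimate is a deliberately crude assumption that counts only Lys/Arg and Asp/Glu side chains.

```python
# Minimal sketch: encoding a peptide for an ML model (illustrative only).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validate(seq):
    """Reject empty sequences and non-canonical residue letters."""
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-canonical residues: {sorted(unknown)}")


def one_hot(seq):
    """Return a len(seq) x 20 binary matrix as a list of lists."""
    _validate(seq)
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in seq:
        row = [0] * len(AMINO_ACIDS)
        row[index[aa]] = 1
        rows.append(row)
    return rows


def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E.

    Assumption: ignores histidine, the termini, and pH dependence.
    """
    _validate(seq)
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def mean_hydrophobicity(seq):
    """Average Kyte-Doolittle hydropathy over all residues."""
    _validate(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def describe(seq):
    """Bundle the global descriptors into one feature dictionary."""
    return {
        "length": len(seq),
        "net_charge": net_charge(seq),
        "mean_hydrophobicity": mean_hydrophobicity(seq),
    }


if __name__ == "__main__":
    toy_peptide = "GKKLLKA"  # hypothetical sequence, not a real AMP
    print(describe(toy_peptide))
    print(len(one_hot(toy_peptide)), "x", len(AMINO_ACIDS))
```

Global descriptors such as these yield a fixed-length feature vector regardless of peptide length, which suits classical models, whereas the one-hot matrix preserves residue order and is the kind of input that sequence models typically consume.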


Fig. 1: Timelines of major machine learning/artificial intelligence (ML/AI) events and recent studies of ML/AI-driven antimicrobial peptide (AMP) identification and design.
Fig. 2: Methods of representing peptides as inputs to machine learning models.
Fig. 3: Schematic illustration of deep generative models for antimicrobial peptides (AMPs).


References

  1. Fjell, C. D., Hiss, J. A., Hancock, R. E. W. & Schneider, G. Designing antimicrobial peptides: form follows function. Nat. Rev. Drug. Discov. 11, 37–51 (2012).

    Article  CAS  Google Scholar 

  2. Yan, J. et al. Recent progress in the discovery and design of antimicrobial peptides using traditional machine learning and deep learning. Antibiotics 11, 1451 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Silva, O. N. et al. Repurposing a peptide toxin from wasp venom into antiinfectives with dual antimicrobial and immunomodulatory properties. PNAS 117, 26936–26945 (2020).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  4. Magana, M. et al. The value of antimicrobial peptides in the age of resistance. Lancet Infect. Dis. 20, e216–e230 (2020).

    Article  CAS  PubMed  Google Scholar 

  5. Bahar, A. & Ren, D. Antimicrobial peptides. Pharmaceuticals 6, 1543–1575 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  6. Chen, C. H. & Lu, T. K. Development and challenges of antimicrobial peptides for therapeutic applications. Antibiotics 9, 24 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Dijksteel, G. S., Ulrich, M. M. W., Middelkoop, E. & Boekema, B. K. H. L. Review: lessons learned from clinical trials using antimicrobial peptides (AMPs). Front. Microbiol. 12, 616979 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  8. Centers for Disease Control and Prevention (U.S.); National Center for Emerging Zoonotic and Infectious Diseases (U.S.), Division of Healthcare Quality Promotion, Antibiotic Resistance Coordination and Strategy Unit. Antibiotic Resistance Threats in the United States, 2019 CDC https://doi.org/10.15620/cdc:82532 (2019).

  9. Murray, C. J. et al. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet 399, 629–655 (2022).

    Article  CAS  Google Scholar 

  10. Santos-Júnior, C. D. et al. Computational exploration of the global microbiome for antibiotic discovery. Preprint at bioRxiv https://doi.org/10.1101/2023.08.31.555663 (2023).

  11. Torres, M. D. T. et al. Human gut metagenomic mining reveals an untapped source of peptide antibiotics. Preprint at bioRxiv https://doi.org/10.1101/2023.08.31.555711 (2023).

  12. Maasch, J. R. M. A., Torres, M. D. T., Melo, M. C. R. & de la Fuente-Nunez, C. Molecular de-extinction of ancient antimicrobial peptides enabled by machine learning. Cell Host Microbe 31, 1230–1274.e6 (2023). This study reports the use of machine learning (ML) to mine the proteomes of the archaic humans Neanderthals and Denisovans, leading to the discovery of the first antibiotics in extinct organisms (including Neanderthalin-1) and launching the field of molecular de-extinction.

    Article  Google Scholar 

  13. Wong, F., de la Fuente-Nunez, C. & Collins, J. J. Leveraging artificial intelligence in the fight against infectious diseases. Science 381, 164–170 (2023). This review summarizes state-of-the-art artificial intelligence (AI)/ML approaches to addressing infectious diseases through the lens of biotechnology and medicine.

    Article  ADS  MathSciNet  CAS  PubMed  PubMed Central  Google Scholar 

  14. Ma, Y. et al. Identification of antimicrobial peptides from the human gut microbiome using deep learning. Nat. Biotechnol. 40, 921–931 (2022). This study reports the use of multiple language processing neural network models to identify 181 antimicrobial peptides (AMPs) with antimicrobial activity from the human gut microbiome, three of which were validated in vivo in a mouse model of bacterial lung infection.

    Article  CAS  PubMed  Google Scholar 

  15. Huang, J. et al. Identification of potent antimicrobial peptides via a machine-learning pipeline that mines the entire space of peptide sequences. Nat. Biomed. Eng. 7, 797–810 (2023). This study applies a cascading pipeline consisting of multiple ML modules to identify 54 AMPs with antimicrobial activity from combinatorial peptide space.

    Article  CAS  PubMed  Google Scholar 

  16. Wan, F., Torres, M. D. T., Peng, J. & de la Fuente-Nunez, C. Molecular de-extinction of antibiotics enabled by deep learning. Preprint at bioRxiv https://doi.org/10.1101/2023.10.01.560353 (2023).

  17. Torres, M. D. T. & de la Fuente-Nunez, C. Toward computer-made artificial antibiotics. Curr. Opin. Microbiol. 51, 30–38 (2019). This review outlines the emerging field of antibiotic discovery enabled by computers.

    Article  CAS  PubMed  Google Scholar 

  18. Chen, C. H., Bepler, T., Pepper, K., Fu, D. & Lu, T. K. Synthetic molecular evolution of antimicrobial peptides. Curr. Opin. Biotechnol. 75, 102718 (2022).

    Article  CAS  PubMed  Google Scholar 

  19. Palmer, N., Maasch, J. R. M. A., Torres, M. D. T. & de la Fuente-Nunez, C. Molecular dynamics for antimicrobial peptide discovery. Infect. Immun. 89, e00703-20 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  21. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (Long and Short Papers) (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).

  22. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

    Article  ADS  CAS  PubMed  Google Scholar 

  23. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84–90 (2017).

    Article  Google Scholar 

  24. Valeri, J. A. et al. Sequence-to-function deep learning frameworks for engineered riboregulators. Nat. Commun. 11, 5058 (2020).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  25. Angenent-Mari, N. M., Garruss, A. S., Soenksen, L. R., Church, G. & Collins, J. J. A deep learning approach to programmable RNA switches. Nat. Commun. 11, 5057 (2020).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  26. Lee, J. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2020).

    Article  CAS  PubMed  Google Scholar 

  27. Gu, Y. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthc. 3, 1–23 (2022).

    Article  Google Scholar 

  28. Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702.e13 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Jin, W. et al. Deep learning identifies synergistic drug combinations for treating COVID-19. Proc. Natl Acad. Sci. USA 118, e2015070118 (2021).

    Article  Google Scholar 

  30. Wong, F., Omori, S., Donghia, N. M., Zheng, E. J. & Collins, J. J. Discovering small-molecule senolytics with deep neural networks. Nat. Aging 3, 734–750 (2023).

    Article  CAS  PubMed  Google Scholar 

  31. Soenksen, L. R. et al. Using deep learning for dermatologist-level detection of suspicious pigmented skin lesions from wide-field images. Sci. Transl. Med. 13, eabb3652 (2021).

    Article  CAS  PubMed  Google Scholar 

  32. Liu, G. et al. Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii. Nat. Chem. Biol. 19, 1342–1350 (2023).

    Article  ADS  CAS  PubMed  Google Scholar 

  33. Zheng, E. J. et al. Discovery of antibiotics that selectively kill metabolically dormant bacteria. Cell Chem. Biol. https://doi.org/10.1016/j.chembiol.2023.10.026 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  34. Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2021).

    Article  Google Scholar 

  35. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Kim, H. K. et al. Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity. Nat. Biotechnol. 36, 239–241 (2018).

    Article  CAS  PubMed  Google Scholar 

  37. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).

    Article  CAS  PubMed  Google Scholar 

  39. Wu, R. et al. High-resolution de novo structure prediction from primary sequence. Preprint at bioRxiv https://doi.org/10.1101/2022.07.21.500999 (2022).

  40. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  41. Vamathevan, J. et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug. Discov. 18, 463–477 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Li, S. et al. MONN: a multi-objective neural network for predicting compound-protein interactions and affinities. Cell Syst. 10, 308–322.e11 (2020).

    Article  CAS  Google Scholar 

  43. Ge, Y. et al. An integrative drug repositioning framework discovered a potential therapeutic agent targeting COVID-19. Signal. Transduct. Target. Ther. 6, 165 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  44. Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).

    Article  CAS  PubMed  Google Scholar 

  45. Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. NPJ Digit. Med. 1, 18 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  46. Shen, D., Wu, G. & Suk, H.-I. Deep learning in medical image analysis. Annu. Rev. Biomed. Eng. 19, 221–248 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Melo, M. C. R., Maasch, J. R. M. A. & de la Fuente-Nunez, C. Accelerating antibiotic discovery through artificial intelligence. Commun. Biol. 4, 1050 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Das, P. et al. Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nat. Biomed. Eng. 5, 613–623 (2021). This study reports the use of a deep generative autoencoder to generate AMPs that were synthesized and tested for antimicrobial activity in vitro and for toxicity in mice.

    Article  CAS  PubMed  Google Scholar 

  49. Torres, M. D. T. et al. Mining for encrypted peptide antibiotics in the human proteome. Nat. Biomed. Eng. 6, 67–75 (2022). This article reports the exploration of the human proteome as a source of antibiotics, leading to the discovery of thousands of previously unrecognized antimicrobial sequences, and providing a new framework for antibiotic discovery by mining entire proteomes.

    Article  PubMed  Google Scholar 

  50. Porto, W. F. et al. In silico optimization of a guava antimicrobial peptide enables combinatorial exploration for peptide design. Nat. Commun. 9, 1490 (2018). This article describes an antibiotic molecule designed by a computer, called guavanin 2, which displays anti-infective properties in vivo.

    Article  ADS  PubMed  PubMed Central  Google Scholar 

  51. Xu, J. et al. Comprehensive assessment of machine learning-based methods for predicting antimicrobial peptides. Brief Bioinform. 22, bbab083 (2021).

    Article  PubMed  Google Scholar 

  52. Osorio, D., Rondón-Villarreal, P. & Torres, R. Peptides: a package for data mining of antimicrobial peptides. R J. 7, 4–14 (2015).

    Article  Google Scholar 

  53. van Westen, G. J. et al. Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. J. Cheminform 5, 41 (2013).

    Article  PubMed  PubMed Central  Google Scholar 

  54. Müller, A. T., Gabernet, G., Hiss, J. A. & Schneider, G. modlAMP: Python for antimicrobial peptides. Bioinformatics 33, 2753–2755 (2017).

    Article  PubMed  Google Scholar 

  55. Romero‐Molina, S., Ruiz‐Blanco, Y. B., Green, J. R. & Sanchez‐Garcia, E. ProtDCal‐Suite: a web server for the numerical codification and functional analysis of proteins. Protein Sci. 28, 1734–1743 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  56. Barigye, S. J., Gómez‐Ganau, S., Serrano‐Candelas, E. & Gozalbes, R. PeptiDesCalculator: software for computation of peptide descriptors. Definition, implementation and case studies for 9 bioactivity endpoints. Proteins 89, 174–184 (2021).

    Article  CAS  PubMed  Google Scholar 

  57. Chen, Z. et al. iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics 34, 2499–2502 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Saeys, Y., Inza, I. & Larranaga, P. A review of feature selection techniques in bioinformatics. Bioinformatics 23, 2507–2517 (2007).

    Article  CAS  PubMed  Google Scholar 

  59. Chen, X. et al. Sequence-based peptide identification, generation, and property prediction with deep learning: a review. Mol. Syst. Des. Eng. 6, 406–428 (2021).

    Article  CAS  Google Scholar 

  60. Kawashima, S. AAindex: amino acid index database. Nucleic Acids Res. 28, 374–374 (2000).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. ElAbd, H. et al. Amino acid encoding for deep learning applications. BMC Bioinform. 21, 235 (2020).

    Article  CAS  Google Scholar 

  62. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

    Article  CAS  PubMed  Google Scholar 

  63. Chung, J., Gülçehre, Ç., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. Preprint at arXiv.org/abs/1412.3555 (2014).

  64. Wan, F., Kontogiorgos-Heintz, D. & de la Fuente-Nunez, C. Deep generative models for peptide design. Digit. Discov. 1, 195–208 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Yan, K., Lv, H., Guo, Y., Peng, W. & Liu, B. sAMPpred-GAT: prediction of antimicrobial peptide by graph attention network and predicted peptide structure. Bioinformatics 39, btac715 (2023).

    Article  CAS  PubMed  Google Scholar 

  66. Ganea, O. et al. GeoMol: torsional geometric generation of molecular 3D conformer ensembles. Adv. Neur. Inf. Process Syst. 34, 13757–13769 (2021).

  67. Jin, W., Wohlwend, J., Barzilay, R. & Jaakkola, T. S. Iterative refinement graph neural network for antibody sequence-structure co-design. In Proc. 10th International Conference on Learning Representations, ICLR 2022 (OpenReview.net, 2022).

  68. Maturana, D. & Scherer, S. VoxNet: a 3D convolutional neural network for real-time object recognition. In Proc. 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 922–928 (IEEE, 2015).

  69. Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

    Article  Google Scholar 

  70. Jiménez, J., Doerr, S., Martínez-Rosell, G., Rose, A. S. & De Fabritiis, G. DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics 33, 3036–3042 (2017).

    Article  PubMed  Google Scholar 

  71. Jones, D. et al. Improved protein–ligand binding affinity prediction with structure-based deep fusion inference. J. Chem. Inf. Model. 61, 1583–1592 (2021).

    Article  CAS  PubMed  Google Scholar 

  72. Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).

    Article  PubMed  Google Scholar 

  73. Grill, J.-B. et al. Bootstrap your own latent — a new approach to self-supervised learning. Adv. Neur. Inf. Process Syst. 33, 21271–21284 (2020).

  74. Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. In Proc. 1st International Conference on Learning Representations, ICLR 2013, Workshop Track (eds Bengio, Y. & LeCun, Y.) (OpenReview.net, 2013).

  75. Brown, T. B. et al. Language models are few-shot learners. Preprint at https://doi.org/10.48550/arXiv.2005.14165 (2020).

  76. Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Alley, E. C., Khimulya, G., Biswas, S., AlQuraishi, M. & Church, G. M. Unified rational protein engineering with sequence-based deep representation learning. Nat. Methods 16, 1315–1322 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  78. Rong, Y. et al. Self-supervised graph transformer on large-scale molecular data. In NIPS'20: Proc. 34th International Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 12559–12571 (Curran Assoc., 2020).

  79. Zang, X., Zhao, X. & Tang, B. Hierarchical molecular graph self-supervised learning for property prediction. Commun. Chem. 6, 34 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  80. Geourjon, C. & Deléage, G. SOPM: a self-optimized method for protein secondary structure prediction. Protein Eng. 7, 157–164 (1994).

    Article  CAS  PubMed  Google Scholar 

  81. Cao, X. et al. PSSP-MVIRT: peptide secondary structure prediction based on a multi-view deep learning architecture. Brief Bioinform. 22, bbab203 (2021).

    Article  PubMed  Google Scholar 

  82. Peri, S., Steen, H. & Pandey, A. GPMAW – a software tool for analyzing proteins and peptides. Trends Biochem. Sci. 26, 687–689 (2001).

    Article  CAS  PubMed  Google Scholar 

  83. Pereira, J. et al. High‐accuracy protein structure prediction in CASP14. Protein 89, 1687–1699 (2021).

    Article  CAS  Google Scholar 

  84. Robin, X. et al. Continuous Automated Model EvaluatiOn (CAMEO) — perspectives on the future of fully automated evaluation of structure prediction methods. Proteins 89, 1977–1986 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  85. Vaswani, A. et al. Attention is all you need. In NIPS'17: Proc. 31st International Conference on Neural Information Processing Systems (eds Guyon, I. et al.) 6000–6010 (Curran Assoc., 2017).

  86. Berman, H. M. The protein data bank. Nucleic Acids Res. 28, 235–242 (2000).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  87. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

    Article  ADS  MathSciNet  CAS  PubMed  Google Scholar 

  88. McDonald, E. F., Jones, T., Plate, L., Meiler, J. & Gulsevin, A. Benchmarking AlphaFold2 on peptide structure prediction. Structure 31, 111–119.e2 (2023).

    Article  CAS  PubMed  Google Scholar 

  89. Lamiable, A. et al. PEP-FOLD3: faster de novo structure prediction for linear peptides in solution and in complex. Nucleic Acids Res. 44, W449–W454 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  90. Timmons, P. B. & Hewage, C. M. APPTEST is a novel protocol for the automatic prediction of peptide tertiary structures. Brief Bioinform. 22, bbab308 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  91. Boaro, A. et al. Structure-function-guided design of synthetic peptides with anti-infective activity derived from wasp venom. Cell Rep. Phys. Sci. 4, 101459 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  92. Torres, M. D. T. et al. Structure-function-guided exploration of the antimicrobial peptide polybia-CP identifies activity determinants and generates synthetic therapeutic candidates. Commun. Biol. 1, 221 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  93. Wong, F. et al. Benchmarking‐enabled molecular docking predictions for antibiotic discovery. Mol. Syst. Biol. 18, e11081 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  94. Luo, S., Shi, C., Xu, M. & Tang, J. Predicting molecular conformation via dynamic graph score matching. Adv. Neur. Inf. Process Syst. 34, 19784–19795 (2021).

  95. Hoogeboom, E. et al. Equivariant diffusion for molecule generation in 3D. Proc. Mach. Learn. Res. 162, 8867–8887 (PMLR, 2022).

  96. Xu, M. et al. GeoDiff: a geometric diffusion model for molecular conformation generation. In Proc. 10th International Conference on Learning Representations, ICLR 2022 (OpenReview.net, 2022).

  97. Mansimov, E., Mahmood, O., Kang, S. & Cho, K. Molecular geometry prediction using a deep generative graph neural network. Sci. Rep. 9, 20381 (2019).

    Article  ADS  PubMed  PubMed Central  Google Scholar 

  98. Gogineni, T. et al. TorsionNet: a reinforcement learning approach to sequential conformer search. In NIPS'20: Proc. 34th International Conference on Neural Information Processing Systems (eds Larochelle, H. et al.) 20142–20153 (ACM, 2020).

  99. Janson, G., Valdes-Garcia, G., Heo, L. & Feig, M. Direct generation of protein conformational ensembles via machine learning. Nat. Commun. 14, 774 (2023).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  100. Pirtskhalava, M. et al. DBAASP v3: database of antimicrobial/cytotoxic activity and structure of peptides as a resource for development of new therapeutics. Nucleic Acids Res. 49, D288–D297 (2021).

    Article  CAS  PubMed  Google Scholar 

  101. García-Jacas, C. R., Pinacho-Castellanos, S. A., García-González, L. A. & Brizuela, C. A. Do deep learning models make a difference in the identification of antimicrobial peptides? Brief Bioinform. 23, bbac094 (2022).

    Article  PubMed  Google Scholar 

  102. Sidorczuk, K. et al. Benchmarks in antimicrobial peptide prediction are biased due to the selection of negative data. Brief Bioinform. 23, bbac343 (2022).

    Article  PubMed  PubMed Central  Google Scholar 

  103. Waghu, F. H., Barai, R. S., Gurung, P. & Idicula-Thomas, S. CAMPR3: a database on sequences, structures and signatures of antimicrobial peptides: Table 1. Nucleic Acids Res. 44, D1094–D1097 (2016).

    Article  CAS  PubMed  Google Scholar 

  104. Zhao, X., Wu, H., Lu, H., Li, G. & Huang, Q. LAMP: a database linking antimicrobial peptides. PLoS ONE 8, e66557 (2013).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  105. Witten, J. & Witten, Z. Deep learning regression model for antimicrobial peptide design. Preprint at bioRxiv https://doi.org/10.1101/692681 (2019).

  106. Wang, G., Li, X. & Wang, Z. APD3: the antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 44, D1087–D1093 (2016).

    Article  CAS  PubMed  Google Scholar 

  107. Meher, P. K., Sahu, T. K., Saini, V. & Rao, A. R. Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Sci. Rep. 7, 42362 (2017).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  108. Xiao, X., Wang, P., Lin, W.-Z., Jia, J.-H. & Chou, K.-C. iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal. Biochem. 436, 168–177 (2013).

    Article  CAS  PubMed  Google Scholar 

  109. Fingerhut, L. C. H. W., Miller, D. J., Strugnell, J. M., Daly, N. L. & Cooke, I. R. ampir: an R package for fast genome-wide prediction of antimicrobial peptides. Bioinformatics 36, 5262–5263 (2021).

    Article  PubMed  Google Scholar 

  110. Santos-Júnior, C. D., Pan, S., Zhao, X.-M. & Coelho, L. P. Macrel: antimicrobial peptide screening in genomes and metagenomes. PeerJ 8, e10555 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  111. Burdukiewicz, M. et al. Proteomic screening for prediction and design of antimicrobial peptides with AmpGram. Int. J. Mol. Sci. 21, 4310 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  112. Lawrence, T. J. et al. amPEPpy 1.0: a portable and accurate antimicrobial peptide prediction tool. Bioinformatics 37, 2058–2060 (2021).

    Article  CAS  PubMed  Google Scholar 

  113. Bhadra, P., Yan, J., Li, J., Fong, S. & Siu, S. W. I. AmPEP: sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest. Sci. Rep. 8, 1697 (2018).

    Article  ADS  PubMed  PubMed Central  Google Scholar 

  114. Pane, K. et al. Antimicrobial potency of cationic antimicrobial peptides can be predicted from their amino acid composition: application to the detection of ‘cryptic’ antimicrobial peptides. J. Theor. Biol. 419, 254–265 (2017).

    Article  ADS  CAS  PubMed  Google Scholar 

  115. Yan, J. et al. Deep-AmPEP30: improve short antimicrobial peptides prediction with deep learning. Mol. Ther. Nucleic Acids 20, 882–894 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  116. Veltri, D., Kamath, U. & Shehu, A. Deep learning improves antimicrobial peptide recognition. Bioinformatics 34, 2740–2747 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Xiao, X., Shao, Y.-T., Cheng, X. & Stamatovic, B. iAMP-CA2L: a new CNN-BiLSTM-SVM classifier based on cellular automata image for identifying antimicrobial peptides and their functional types. Brief. Bioinform. 22, bbab209 (2021).

    Article  PubMed  Google Scholar 

  118. Robles-Loaiza, A. A. et al. Traditional and computational screening of non-toxic peptides and approaches to improving selectivity. Pharmaceuticals 15, 323 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  119. Plisson, F., Ramírez-Sánchez, O. & Martínez-Hernández, C. Machine learning-guided discovery and design of non-hemolytic peptides. Sci. Rep. 10, 16581 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  120. Chaudhary, K. et al. A web server and mobile app for computing hemolytic potency of peptides. Sci. Rep. 6, 22843 (2016).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  121. Win, T. S. et al. HemoPred: a web server for predicting the hemolytic activity of peptides. Future Med. Chem. 9, 275–291 (2017).

    Article  ADS  CAS  PubMed  Google Scholar 

  122. Hasan, M. M. et al. HLPpred-Fuse: improved and robust prediction of hemolytic peptide and its activity by fusing multiple feature representation. Bioinformatics 36, 3350–3356 (2020).

    Article  CAS  PubMed  Google Scholar 

  123. Gautam, A. et al. Hemolytik: a database of experimentally determined hemolytic and non-hemolytic peptides. Nucleic Acids Res. 42, D444–D449 (2014).

    Article  CAS  PubMed  Google Scholar 

  124. Zakharova, E., Orsi, M., Capecchi, A. & Reymond, J. Machine learning guided discovery of non‐hemolytic membrane disruptive anticancer peptides. ChemMedChem 17, e202200291 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  125. Timmons, P. B. & Hewage, C. M. HAPPENN is a novel tool for hemolytic activity prediction for therapeutic peptides which employs neural networks. Sci. Rep. 10, 10869 (2020).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  126. Capecchi, A. et al. Machine learning designs non-hemolytic antimicrobial peptides. Chem. Sci. 12, 9221–9232 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  127. Salem, M., Keshavarzi Arshadi, A. & Yuan, J. S. AMPDeep: hemolytic activity prediction of antimicrobial peptides using transfer learning. BMC Bioinform. 23, 389 (2022).

    Article  Google Scholar 

  128. Gupta, S. et al. In silico approach for predicting toxicity of peptides and proteins. PLoS ONE 8, e73957 (2013).

    Article  ADS  CAS  PubMed  PubMed Central  Google Scholar 

  129. Sharma, N., Naorem, L. D., Jain, S. & Raghava, G. P. S. ToxinPred2: an improved method for predicting toxicity of proteins. Brief. Bioinform. 23, bbac174 (2022).

    Article  PubMed  Google Scholar 

  130. Naamati, G., Askenazi, M. & Linial, M. ClanTox: a classifier of short animal toxins. Nucleic Acids Res. 37, W363–W368 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  131. Wei, L., Ye, X., Sakurai, T., Mu, Z. & Wei, L. ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning. Bioinformatics 38, 1514–1524 (2022).

  132. Wei, L., Ye, X., Xue, Y., Sakurai, T. & Wei, L. ATSE: a peptide toxicity predictor by exploiting structural and evolutionary information based on graph neural network and attention mechanism. Brief. Bioinform. 22, bbab041 (2021).

  133. Zhang, J., Zhang, Z., Pu, L., Tang, J. & Guo, F. AIEpred: an ensemble predictive model of classifier chain to identify anti-inflammatory peptides. IEEE/ACM Trans. Comput. Biol. Bioinform. 18, 1831–1840 (2021).

  134. Khatun, M. S., Hasan, M. M. & Kurata, H. PreAIP: computational prediction of anti-inflammatory peptides by integrating multiple complementary features. Front. Genet. 10, 219 (2019).

  135. Manavalan, B., Shin, T. H., Kim, M. O. & Lee, G. AIPpred: sequence-based prediction of anti-inflammatory peptides using random forest. Front. Pharmacol. 9, 276 (2018).

  136. Gupta, S., Sharma, A. K., Shastri, V., Madhu, M. K. & Sharma, V. K. Prediction of anti-inflammatory proteins/peptides: an insilico approach. J. Transl. Med. 15, 7 (2017).

  137. Gupta, S., Madhu, M. K., Sharma, A. K. & Sharma, V. K. ProInflam: a webserver for the prediction of proinflammatory antigenicity of peptides and proteins. J. Transl. Med. 14, 178 (2016).

  138. Manavalan, B., Shin, T. H., Kim, M. O. & Lee, G. PIP-EL: a new ensemble learning method for improved proinflammatory peptide predictions. Front. Immunol. 9, 1783 (2018).

  139. Khatun, M. S., Hasan, M. M., Shoombuatong, W. & Kurata, H. ProIn-Fuse: improved and robust prediction of proinflammatory peptides by fusing of multiple feature representations. J. Comput. Aided Mol. Des. 34, 1229–1236 (2020).

  140. Boeckmann, B. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31, 365–370 (2003).

  141. Sharma, A., Singla, D., Rashid, M. & Raghava, G. P. S. Designing of peptides with desired half-life in intestine-like environment. BMC Bioinform. 15, 282 (2014).

  142. Yin, S., Ding, F. & Dokholyan, N. V. Eris: an automated estimator of protein stability. Nat. Methods 4, 466–467 (2007).

  143. Persikov, A. V., Ramshaw, J. A. M. & Brodsky, B. Prediction of collagen stability from amino acid sequence. J. Biol. Chem. 280, 19343–19349 (2005).

  144. Wang, F. et al. Advancing oral delivery of biologics: machine learning predicts peptide stability in the gastrointestinal tract. Int. J. Pharm. 634, 122643 (2023).

  145. Mathur, D., Singh, S., Mehta, A., Agrawal, P. & Raghava, G. P. S. In silico approaches for predicting the half-life of natural and modified peptides in blood. PLoS ONE 13, e0196829 (2018).

  146. Cardoso, M. H. et al. Non-lytic antibacterial peptides that translocate through bacterial membranes to act on intracellular targets. Int. J. Mol. Sci. 20, 4877 (2019).

  147. Ho, Y.-H., Shah, P., Chen, Y.-W. & Chen, C.-S. Systematic analysis of intracellular-targeting antimicrobial peptides, bactenecin 7, hybrid of pleurocidin and dermaseptin, proline–arginine-rich peptide, and lactoferricin B, by using Escherichia coli proteome microarrays. Mol. Cell. Proteom. 15, 1837–1847 (2016).

  148. Schissel, C. K. et al. Deep learning to design nuclear-targeting abiotic miniproteins. Nat. Chem. 13, 992–1000 (2021).

  149. Fu, X., Cai, L., Zeng, X. & Zou, Q. StackCPPred: a stacking and pairwise energy content-based prediction of cell-penetrating peptides and their uptake efficiency. Bioinformatics 36, 3028–3034 (2020).

  150. Nasiri, F., Atanaki, F. F., Behrouzi, S., Kavousi, K. & Bagheri, M. CpACpP: in silico cell-penetrating anticancer peptide prediction using a novel bioinformatics framework. ACS Omega 6, 19846–19859 (2021).

  151. Wolfe, J. M. et al. Machine learning to predict cell-penetrating peptides for antisense delivery. ACS Cent. Sci. 4, 512–520 (2018).

  152. Kumar, V. et al. Prediction of cell-penetrating potential of modified peptides containing natural and chemically modified residues. Front. Microbiol. 9, 725 (2018).

  153. Manavalan, B., Subramaniyam, S., Shin, T. H., Kim, M. O. & Lee, G. Machine-learning-based prediction of cell-penetrating peptides and their uptake efficiency with improved accuracy. J. Proteome Res. 17, 2715–2726 (2018).

  154. Sanders, W. S., Johnston, C. I., Bridges, S. M., Burgess, S. C. & Willeford, K. O. Prediction of cell penetrating peptides by support vector machines. PLoS Comput. Biol. 7, e1002101 (2011).

  155. Qiang, X. et al. CPPred-FL: a sequence-based predictor for large-scale identification of cell-penetrating peptides by feature representation learning. Brief. Bioinform. 21, 11–23 (2018).

  156. Lei, Y. et al. A deep-learning framework for multi-level peptide–protein interaction prediction. Nat. Commun. 12, 5465 (2021).

  157. Cunningham, J. M., Koytiger, G., Sorger, P. K. & AlQuraishi, M. Biophysical prediction of protein–peptide interactions and signaling networks using machine learning. Nat. Methods 17, 175–183 (2020).

  158. Li, Z., Miao, Q., Yan, F., Meng, Y. & Zhou, P. Machine learning in quantitative protein–peptide affinity prediction: implications for therapeutic peptide design. Curr. Drug Metab. 20, 170–176 (2019).

  159. Trisciuzzi, D., Siragusa, L., Baroni, M., Cruciani, G. & Nicolotti, O. An integrated machine learning model to spot peptide binding pockets in 3D protein screening. J. Chem. Inf. Model. 62, 6812–6824 (2022).

  160. Wang, R., Jin, J., Zou, Q., Nakai, K. & Wei, L. Predicting protein–peptide binding residues via interpretable deep learning. Bioinformatics 38, 3351–3360 (2022).

  161. Müller, R., Kornblith, S. & Hinton, G. When does label smoothing help? In Proc. 33rd International Conference on Neural Information Processing Systems (eds Wallach, H. M. et al.) 4694–4703 (ACM, 2019).

  162. Imani, E. & White, M. Improving regression performance with distributional losses. In Proc. 35th International Conference on Machine Learning, Vol. 80 (eds Dy, J. G. & Krause, A.) 2162–2171 (PMLR, 2018).

  163. Bekker, J. & Davis, J. Learning from positive and unlabeled data: a survey. Mach. Learn. 109, 719–760 (2020).

  164. Yoshida, M. et al. Using evolutionary algorithms and machine learning to explore sequence space for the discovery of antimicrobial peptides. Chem 4, 533–543 (2018).

  165. Boone, K., Wisdom, C., Camarda, K., Spencer, P. & Tamerler, C. Combining genetic algorithm with machine learning strategies for designing potent antimicrobial peptides. BMC Bioinform. 22, 239 (2021).

  166. Rezende, D. J., Mohamed, S. & Wierstra, D. Stochastic backpropagation and approximate inference in deep generative models. In Proc. 31st International Conference on Machine Learning, Vol. 32 (eds Xing, E. P. & Jebara, T.) 1278–1286 (PMLR, 2014).

  167. Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. In Proc. 2nd International Conference on Learning Representations, ICLR 2014, Conference Track (eds Bengio, Y. & LeCun, Y.) (OpenReview.net, 2014).

  168. Rezende, D. & Mohamed, S. Variational inference with normalizing flows. In Proc. 32nd International Conference on Machine Learning, Vol. 37 (eds Bach, F. & Blei, D.) 1530–1538 (PMLR, 2015).

  169. Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).

  170. Song, Y. et al. Score-based generative modeling through stochastic differential equations. In 9th International Conference on Learning Representations, ICLR 2021 (OpenReview.net, 2021).

  171. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. In Proc. 34th Conference on Neural Information Processing Systems, Advances in Neural Information Processing Systems 33 (eds Larochelle, H. et al.) (NeurIPS, 2020).

  172. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. & Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proc. 32nd International Conference on Machine Learning. Vol. 37 (eds Bach, F. & Blei, D.) 2256–2265 (PMLR, 2015).

  173. Müller, A. T., Hiss, J. A. & Schneider, G. Recurrent neural network model for constructive peptide design. J. Chem. Inf. Model. 58, 472–479 (2018).

  174. Nagarajan, D. et al. Computational antimicrobial peptide design and evaluation against multidrug-resistant clinical isolates of bacteria. J. Biol. Chem. 293, 3492–3509 (2018).

  175. Wang, C., Garlick, S. & Zloh, M. Deep learning for novel antimicrobial peptide design. Biomolecules 11, 471 (2021).

  176. Dean, S. N. & Walper, S. A. Variational autoencoder for generation of antimicrobial peptides. ACS Omega 5, 20746–20754 (2020).

  177. Dean, S. N., Alvarez, J. A. E., Zabetakis, D., Walper, S. A. & Malanoski, A. P. PepVAE: variational autoencoder framework for antimicrobial peptide generation and activity prediction. Front. Microbiol. https://doi.org/10.3389/fmicb.2021.725727 (2021).

  178. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2019).

  179. Arjovsky, M. & Bottou, L. Towards principled methods for training generative adversarial networks. In Proc. 5th International Conference on Learning Representations, ICLR 2017, Conference Track (OpenReview.net, 2017).

  180. Tucs, A. et al. Generating ampicillin-level antimicrobial peptides with activity-aware generative adversarial networks. ACS Omega 5, 22847–22851 (2020).

  181. Van Oort, C. M., Ferrell, J. B., Remington, J. M., Wshah, S. & Li, J. AMPGAN v2: machine learning-guided design of antimicrobial peptides. J. Chem. Inf. Model. 61, 2198–2207 (2021).

  182. Cao, Q. et al. Designing antimicrobial peptides using deep learning and molecular dynamic simulations. Brief. Bioinform. 24, bbad058 (2023).

  183. Ferrell, J. B. et al. A generative approach toward precision antimicrobial peptide design. Preprint at bioRxiv https://doi.org/10.1101/2020.10.02.324087 (2021).

  184. Shi, C. et al. GraphAF: a flow-based autoregressive model for molecular graph generation. In Proc. 8th International Conference on Learning Representations, ICLR 2020 (OpenReview.net, 2020).

  185. Anand, N. & Achim, T. Protein structure and sequence generation with equivariant denoising diffusion probabilistic models. Preprint at https://doi.org/10.48550/arXiv.2205.15019 (2022).

  186. Coin, I., Beyermann, M. & Bienert, M. Solid-phase peptide synthesis: from standard procedures to the synthesis of difficult sequences. Nat. Protoc. 2, 3247–3256 (2007).

  187. Mueller, L. K., Baumruck, A. C., Zhdanova, H. & Tietze, A. A. Challenges and perspectives in chemical synthesis of highly hydrophobic peptides. Front. Bioeng. Biotechnol. 8, 162 (2020).

  188. Isidro-Llobet, A. et al. Sustainability challenges in peptide synthesis and purification: from R&D to production. J. Org. Chem. 84, 4615–4628 (2019).

  189. Conchillo-Solé, O. et al. AGGRESCAN: a server for the prediction and evaluation of ‘hot spots’ of aggregation in polypeptides. BMC Bioinform. 8, 65 (2007).

  190. Fernandez-Escamilla, A.-M., Rousseau, F., Schymkowitz, J. & Serrano, L. Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat. Biotechnol. 22, 1302–1306 (2004).

  191. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. & Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proc. 31st International Conference on Neural Information Processing Systems, 6629–6640 (Curran Assoc., 2017).

  192. Preuer, K., Renz, P., Unterthiner, T., Hochreiter, S. & Klambauer, G. Fréchet ChemNet distance: a metric for generative models for molecules in drug discovery. J. Chem. Inf. Model. 58, 1736–1741 (2018).

  193. Ribeiro, M. T., Singh, S. & Guestrin, C. ‘Why should I trust you?’ explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144 (ACM, 2016).

  194. Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. In Proc. 34th International Conference on Machine Learning, Vol. 70, 3145–3153 (JMLR.org, 2017).

  195. Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning, Vol. 70, 3319–3328 (JMLR.org, 2017).

  196. Jiménez-Luna, J., Grisoni, F. & Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2, 573–584 (2020).

  197. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. 31st International Conference on Neural Information Processing Systems, 4768–4777 (Curran Assoc., 2017).

  198. Yuan, H., Yu, H., Gui, S. & Ji, S. Explainability in graph neural networks: a taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 5782–5799 (2023).

  199. Farahani, A., Voghoei, S., Rasheed, K. & Arabnia, H. R. In Advances in Data Science and Information Engineering. Transactions on Computational Science and Computational Intelligence (eds Stahlbock, R. et al.) https://doi.org/10.1007/978-3-030-71704-9_65 (Springer, 2021).

  200. Reffuveille, F., de la Fuente-Núñez, C., Mansour, S. & Hancock, R. E. W. A broad-spectrum antibiofilm peptide enhances antibiotic action against bacterial biofilms. Antimicrob. Agents Chemother. 58, 5363–5371 (2014).

  201. Karniadakis, G. E. et al. Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440 (2021).

  202. Doerr, S. et al. TorchMD: a deep learning framework for molecular simulations. J. Chem. Theory Comput. 17, 2355–2363 (2021).

  203. Husic, B. E. et al. Coarse graining molecular dynamics with graph neural networks. J. Chem. Phys. 153, 194101 (2020).

  204. Omar, S. I., Keasar, C., Ben-Sasson, A. J. & Haber, E. Protein design using physics informed neural networks. Biomolecules 13, 457 (2023).

  205. Ren, P. et al. A comprehensive survey of neural architecture search. ACM Comput. Surv. 54, 1–34 (2022).

  206. He, X., Zhao, K. & Chu, X. AutoML: a survey of the state-of-the-art. Knowl. Based Syst. 212, 106622 (2021).

  207. Valeri, J. A. et al. BioAutoMATED: an end-to-end automated machine learning tool for explanation and design of biological sequences. Cell Syst. 14, 525–542 (2023).

  208. Ferrazzano, L. et al. Sustainability in peptide chemistry: current synthesis and purification technologies and future challenges. Green Chem. 24, 975–1020 (2022).

Acknowledgements

C.d.l.F.-N. holds a Presidential Professorship at the University of Pennsylvania, is a recipient of the Langer Prize by the AIChE Foundation, and acknowledges funding from the IADR Innovation in Oral Care Award, the Procter & Gamble Company, United Therapeutics, a BBRF Young Investigator Grant, the Nemirovsky Prize, Penn Health-Tech Accelerator Award, the Dean’s Innovation Fund from the Perelman School of Medicine at the University of Pennsylvania, the National Institute of General Medical Sciences of the US National Institutes of Health (NIH) under award number R35GM138201, and the Defense Threat Reduction Agency (DTRA; HDTRA11810041, HDTRA1-21-1-0014, and HDTRA1-23-1-0001). We thank K. Pepper for editing the manuscript and de la Fuente Lab members for discussions. F. Wong was supported by the National Institute of Allergy and Infectious Diseases of the NIH under award no. K25AI168451. J.J.C. was supported by the Defense Threat Reduction Agency (grant no. HDTRA12210032), the NIH (grant no. R01-AI146194), and the Broad Institute of MIT and Harvard. This work is part of the Antibiotics-AI Project, which is directed by J.J.C. and supported by the Audacious Project, Flu Lab, LLC, the Sea Grape Foundation, Rosamund Zander and Hansjorg Wyss for the Wyss Foundation, and an anonymous donor.

Author information

Contributions

F. Wan and F. Wong researched and wrote the first manuscript draft. J.J.C. and C.d.l.F.-N. supervised the work. All authors contributed to writing and editing the manuscript.

Corresponding authors

Correspondence to James J. Collins or Cesar de la Fuente-Nunez.

Ethics declarations

Competing interests

J.J.C. is scientific co-founder and scientific advisory board chair of EnBiotix, an antibiotic drug discovery company, and Phare Bio, a non-profit venture focused on antibiotic drug development. C.d.l.F.-N. provides consulting services to Invaio Sciences and is a member of the Scientific Advisory Boards of Nowture S.L. and Phare Bio. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Bioengineering thanks Carlos Brizuela, Jun Wang and Monique van Hoek for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wan, F., Wong, F., Collins, J.J. et al. Machine learning for antimicrobial peptide identification and design. Nat Rev Bioeng (2024). https://doi.org/10.1038/s44222-024-00152-x

  • DOI: https://doi.org/10.1038/s44222-024-00152-x
