Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra

Abstract

Peptidic natural products (PNPs) include many antibiotics and other bioactive compounds. While the recent launch of the Global Natural Products Social (GNPS) molecular networking infrastructure is transforming PNP discovery into a high-throughput technology, PNP identification algorithms are needed to realize the potential of the GNPS project. GNPS relies on the assumption that each connected component of a molecular network (representing related metabolites) illuminates the ‘dark matter of metabolomics’ as long as it contains a known metabolite present in a database. We reveal a surprising diversity of PNPs produced by related bacteria and show that, contrary to the ‘comparative metabolomics’ assumption, two related bacteria are unlikely to produce identical PNPs (even though they are likely to produce similar PNPs). Since this observation undermines the utility of GNPS, we developed a PNP identification tool, VarQuest, that illuminates the connected components in a molecular network even if they do not contain known PNPs and only contain their variants. VarQuest reveals an order of magnitude more PNP variants than all previous PNP discovery efforts and demonstrates that GNPS already contains spectra from 41% of the currently known PNP families. The enormous diversity of PNPs suggests that biosynthetic gene clusters in various microorganisms constantly evolve to generate a unique spectrum of PNP variants that differ from PNPs in other species.
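The network assumption described above can be illustrated with a minimal sketch. It is not the GNPS or VarQuest implementation: the node names, the edge list and the `known` set are hypothetical toy data, and the function names are introduced here for illustration only. The sketch splits a molecular network into connected components and separates those that contain at least one known metabolite from those that contain none, which are the components a variant-tolerant search would aim to annotate.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal from each node not yet visited.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def split_by_known(nodes, edges, known):
    """Split components into those with a known metabolite and those without."""
    lit, dark = [], []
    for component in connected_components(nodes, edges):
        (lit if component & known else dark).append(component)
    return lit, dark


# Hypothetical toy network: five spectra, two components, one known metabolite.
nodes = ["s1", "s2", "s3", "s4", "s5"]
edges = [("s1", "s2"), ("s2", "s3"), ("s4", "s5")]
known = {"s1"}

lit, dark = split_by_known(nodes, edges, known)
```

In this toy network the component containing `s1` counts as illuminated, while the component made of `s4` and `s5` contains no database match and stays dark under the assumption the abstract questions.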

Fig. 1: Network-based and network-independent strategies for variable PNP identification.
Fig. 2: VarQuest pipeline.
Fig. 3: Peptide network of the cyclosporin family in the PNPdatabase extended by the newly identified cyclosporin variants.

Acknowledgements

We thank K. Vyatkina for fruitful discussions and A. Prjibelski for help with manuscript preparation. The work of A.G., A.M., A.S., A.K. and P.A.P. was supported by the Russian Science Foundation (grant 14-50-00069). The work of H.M. and P.A.P. was supported by the US National Institutes of Health (grant 2-P41-GM103484).

Author information

Contributions

A.G. implemented the VarQuest algorithm. A.S. and A.K. improved and sped up the DEREPLICATOR software. A.G., A.M. and H.M. designed the webserver. A.G. and A.M. did the VarQuest benchmarking. H.M. and P.A.P. designed and directed the work. A.G., H.M. and P.A.P. wrote the manuscript.

Corresponding author

Correspondence to Pavel A. Pevzner.

Ethics declarations

Competing interests

P.A.P. has an equity interest in Digital Proteomics, a company that may potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Tables 1–18, Supplementary Figures 1–9 and Supplementary References.

Life Sciences Reporting Summary

About this article

Cite this article

Gurevich, A., Mikheenko, A., Shlemov, A. et al. Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra. Nat Microbiol 3, 319–327 (2018). https://doi.org/10.1038/s41564-017-0094-2

  • DOI: https://doi.org/10.1038/s41564-017-0094-2
