Dereplication of peptidic natural products through database search of mass spectra

Abstract

Peptidic natural products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. Although recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers use dereplication strategies that identify known PNPs and lead to the discovery of new ones, even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enables high-throughput PNP identification and that is compatible with large-scale mass-spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.

Figure 1: DEREPLICATOR pipeline.
Figure 2: Number of PSMs and peptides identified by DEREPLICATOR.
Figure 3: Number of peptides identified by DEREPLICATOR in SpectraHigh data set.
Figure 4: Spectral networks illustrating the results of the SILAC experiment.
Figure 5: Generating theoretical spectra and computing P values of PSMs formed by PNPs with various architectures.


Acknowledgements

We thank M. Wang and N. Bandeira for insightful suggestions on using molecular networking and spectral library search, and M. Medema for guidelines on running antiSMASH. The work of H.M., P.D. and P.A.P. was supported by the US National Institutes of Health (grant 2-P41-GM103484). P.D. is supported by GM097509. A.G., A.M. and P.A.P. were supported by the Russian Science Foundation (grant 14-50-00069).

Author information

Contributions

H.M. and A.G. implemented the DEREPLICATOR algorithm. H.M., A.G. and A.M. designed the webserver. N.G. and L.-F.N. collected and analyzed mass spectrometry data and conducted SILAC experiments. A.N. and K.T. purified standard surugamide. P.C.D. and P.A.P. designed and directed the work. H.M. and P.A.P. wrote the manuscript.

Corresponding author

Correspondence to Pavel A. Pevzner.

Ethics declarations

Competing interests

P.A.P. has an equity interest in Digital Proteomics, LLC, a company that may potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.

Supplementary information

Supplementary Text and Figures

Supplementary Results, Supplementary Tables 1–7, Supplementary Figures 1–12 and Supplementary Note. (PDF 4114 kb)

About this article

Cite this article

Mohimani, H., Gurevich, A., Mikheenko, A. et al. Dereplication of peptidic natural products through database search of mass spectra. Nat Chem Biol 13, 30–37 (2017). https://doi.org/10.1038/nchembio.2219
