Article

Dereplication of peptidic natural products through database search of mass spectra

Nature Chemical Biology volume 13, pages 30–37 (2017)

Abstract

Peptidic natural products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. Although recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers use dereplication strategies that identify known PNPs and lead to the discovery of new ones, even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enables high-throughput PNP identification and that is compatible with large-scale mass-spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.
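To make the idea of dereplication by database search concrete, the following is a minimal sketch and not the DEREPLICATOR algorithm itself. It assumes linear peptides built from four standard amino acids, uses cumulative residue masses as a toy stand-in for fragment ions, and scores candidates by a simple shared-peak count. The function names, the example database, the peak list and the 0.02 Da tolerance are all hypothetical choices made for illustration.

```python
from bisect import bisect_left

# Monoisotopic residue masses (Da) for a few standard amino acids.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "V": 99.06841, "L": 113.08406}


def total_mass(sequence):
    """Sum of residue masses; a toy stand-in for the precursor mass."""
    return sum(RESIDUE_MASS[r] for r in sequence)


def prefix_masses(sequence):
    """Cumulative masses of every proper prefix (toy fragment ions)."""
    masses, running = [], 0.0
    for residue in sequence[:-1]:
        running += RESIDUE_MASS[residue]
        masses.append(running)
    return masses


def shared_peak_count(theoretical, peaks, tol):
    """Count theoretical masses that have an observed peak within tol."""
    peaks = sorted(peaks)
    count = 0
    for mass in theoretical:
        i = bisect_left(peaks, mass - tol)  # first peak >= mass - tol
        if i < len(peaks) and peaks[i] <= mass + tol:
            count += 1
    return count


def dereplicate(peaks, precursor_mass, database, tol=0.02):
    """Return (name, score) for the best-scoring database entry whose
    mass matches the precursor within tol, or None if nothing matches."""
    best = None
    for name, sequence in database.items():
        if abs(total_mass(sequence) - precursor_mass) > tol:
            continue  # precursor-mass filter
        score = shared_peak_count(prefix_masses(sequence), peaks, tol)
        if best is None or score > best[1]:
            best = (name, score)
    return best


# Hypothetical database and spectrum (not taken from the article).
DATABASE = {"pep1": "GAVL", "pep2": "LVAG", "pep3": "GGGG"}
PEAKS = [57.021, 128.059, 150.0, 227.127]  # 150.0 is a noise peak

if __name__ == "__main__":
    print(dereplicate(PEAKS, 340.211, DATABASE))  # prints ('pep1', 3)
```

Here pep1 and pep2 have the same total mass, so only the fragment peaks tell them apart. Real PNPs are often cyclic or branched and carry non-standard residues, which this sketch does not attempt to model.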

Acknowledgements

We thank M. Wang and N. Bandeira for insightful suggestions on using molecular networking and spectral library search, and M. Medema for guidelines on running antiSMASH. The work of H.M., P.D. and P.A.P. was supported by the US National Institutes of Health (grant 2-P41-GM103484). P.D. is supported by GM097509. A.G., A.M. and P.A.P. were supported by the Russian Science Foundation (grant 14-50-00069).

Author information

Affiliations

  1. Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California, USA.

    • Hosein Mohimani
    • Pavel A Pevzner
  2. Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia.

    • Alexey Gurevich
    • Alla Mikheenko
    • Pavel A Pevzner
  3. Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA.

    • Neha Garg
    • Louis-Felix Nothias
    • Pieter C Dorrestein
  4. Laboratory of Aquatic Natural Products Chemistry, School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan.

    • Akihiro Ninomiya
    • Kentaro Takada
  5. Department of Pharmacology, University of California, San Diego, La Jolla, California, USA.

    • Pieter C Dorrestein

Authors

  1. Hosein Mohimani

  2. Alexey Gurevich

  3. Alla Mikheenko

  4. Neha Garg

  5. Louis-Felix Nothias

  6. Akihiro Ninomiya

  7. Kentaro Takada

  8. Pieter C Dorrestein

  9. Pavel A Pevzner

Contributions

H.M. and A.G. implemented the DEREPLICATOR algorithm. H.M., A.G. and A.M. designed the webserver. N.G. and L.-F.N. collected and analyzed mass spectrometry data and conducted SILAC experiments. A.N. and K.T. purified standard surugamide. P.C.D. and P.A.P. designed and directed the work. H.M. and P.A.P. wrote the manuscript.

Competing interests

P.A.P. has an equity interest in Digital Proteomics, LLC, a company that may potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.

Corresponding author

Correspondence to Pavel A Pevzner.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Results, Supplementary Tables 1–7, Supplementary Figures 1–12 and Supplementary Note.

About this article

DOI

https://doi.org/10.1038/nchembio.2219