Peptidic natural products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. Although recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers use dereplication strategies that identify known PNPs and lead to the discovery of new ones, even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enables high-throughput PNP identification and that is compatible with large-scale mass-spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.
We thank M. Wang and N. Bandeira for insightful suggestions on using molecular networking and spectral library search, and M. Medema for guidelines on running antiSMASH. The work of H.M., P.D. and P.A.P. was supported by the US National Institutes of Health (grant 2-P41-GM103484). P.D. is supported by GM097509. A.G., A.M. and P.A.P. were supported by the Russian Science Foundation (grant 14-50-00069).
P.A.P. has an equity interest in Digital Proteomics, LLC, a company that may potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.
Cite this article
Mohimani, H., Gurevich, A., Mikheenko, A. et al. Dereplication of peptidic natural products through database search of mass spectra. Nat Chem Biol 13, 30–37 (2017). https://doi.org/10.1038/nchembio.2219