We report O-Pair Search, an approach to identify O-glycopeptides and localize O-glycosites. Using paired collision- and electron-based dissociation spectra, O-Pair Search identifies O-glycopeptides via an ion-indexed open modification search and localizes O-glycosites using graph theory and probability-based localization. O-Pair Search reduces search times more than 2,000-fold compared to current O-glycopeptide processing software, while defining O-glycosite localization confidence levels and generating more O-glycopeptide identifications. Beyond the mucin-type O-glycopeptides discussed here, O-Pair Search also accepts user-defined glycan databases, making it compatible with many types of O-glycosylation. O-Pair Search is freely available within the open-source MetaMorpheus platform at https://github.com/smith-chem-wisc/MetaMorpheus.
The data used in this manuscript are available through the ProteomeXchange Consortium via the PRIDE partner repository (ref. 48) with the dataset identifier PXD017646 (ref. 15) and via MassIVE with identifier MSV000083070 (ref. 9). Data processed with Byonic and Protein Prospector for the urinary O-glycopeptide dataset were downloaded from ref. 8.
O-Pair Search is available in MetaMorpheus (v.0.0.307 for HCD–EThcD data and v.0.0.308 for HCD–HCD and HCD–sceHCD data), which is open source and freely available at https://github.com/smith-chem-wisc/MetaMorpheus under a permissive license. All source code was written in Microsoft C# with .NET Core 3.1 using Visual Studio.
Abrahams, J. L. et al. Recent advances in glycoinformatic platforms for glycomics and glycoproteomics. Curr. Opin. Struct. Biol. 62, 56–69 (2020).
You, X., Qin, H. & Ye, M. Recent advances in methods for the analysis of protein O-glycosylation at proteome level. J. Sep. Sci. 41, 248–261 (2018).
Suttapitugsakul, S., Sun, F. & Wu, R. Recent advances in glycoproteomic analysis by mass spectrometry. Anal. Chem. 92, 267–291 (2020).
Riley, N. M. & Coon, J. J. The role of electron transfer dissociation in modern proteomics. Anal. Chem. 90, 40–64 (2018).
Reily, C., Stewart, T. J., Renfrow, M. B. & Novak, J. Glycosylation in health and disease. Nat. Rev. Nephrol. 15, 346–366 (2019).
Brockhausen, I. & Stanley, P. in Essentials of Glycobiology (eds Varki, A. et al.) Ch. 10 (Cold Spring Harbor Laboratory Press, 2017).
Darula, Z. & Medzihradszky, K. F. Analysis of mammalian O-glycopeptides—we have made a good start, but there is a long way to go. Mol. Cell. Proteomics 17, 2–17 (2018).
Pap, A., Klement, E., Hunyadi-Gulyas, E., Darula, Z. & Medzihradszky, K. F. Status report on the high-throughput characterization of complex intact O-glycopeptide mixtures. J. Am. Soc. Mass Spectrom. 29, 1210–1220 (2018).
Darula, Z., Pap, Á. & Medzihradszky, K. F. Extended sialylated O-glycan repertoire of human urinary glycoproteins discovered and characterized using electron-transfer/higher-energy collision dissociation. J. Proteome Res. 18, 280–291 (2019).
Pap, A., Tasnadi, E., Medzihradszky, K. F. & Darula, Z. Novel O-linked sialoglycan structures in human urinary glycoproteins. Mol. Omics 16, 156–164 (2020).
Khoo, K. H. Advances toward mapping the full extent of protein site-specific O-GalNAc glycosylation that better reflects underlying glycomic complexity. Curr. Opin. Struct. Biol. 56, 146–154 (2019).
Mao, J. et al. A new searching strategy for the identification of O-linked glycopeptides. Anal. Chem. 91, 3852–3859 (2019).
Izaham, A. R. A. & Scott, N. E. Open database searching enables the identification and comparison of bacterial glycoproteomes without defining glycan compositions prior to searching. Mol. Cell. Proteomics https://doi.org/10.1074/mcp.TIR120.002100 (2020).
Huang, J. et al. Development of a computational tool for automated interpretation of intact O-glycopeptide tandem mass spectra from single proteins. Anal. Chem. 92, 6777–6784 (2020).
Riley, N. M., Malaker, S. A., Driessen, M. & Bertozzi, C. R. Optimal dissociation methods differ for N- and O-glycopeptides. J. Proteome Res. 19, 3286–3301 (2020).
Solntsev, S. K., Shortreed, M. R., Frey, B. L. & Smith, L. M. Enhanced global post-translational modification discovery with MetaMorpheus. J. Proteome Res. 17, 1844–1851 (2018).
Kong, A. T., Leprevost, F. V., Avtonomov, D. M., Mellacheruvu, D. & Nesvizhskii, A. I. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat. Methods 14, 513–520 (2017).
Liu, X. et al. Identification of ultramodified proteins using top-down tandem mass spectra. J. Proteome Res. 12, 5830–5838 (2013).
Frank, A. M., Pesavento, J. J., Mizzen, C. A., Kelleher, N. L. & Pevzner, P. A. Interpreting top-down mass spectra using spectral alignment. Anal. Chem. 80, 2499–2505 (2008).
Pevzner, P. A., Dančík, V. & Tang, C. L. Mutation-tolerant protein identification by mass spectrometry. J. Comput. Biol. 7, 777–787 (2001).
Park, J. et al. Informed-Proteomics: open-source software package for top-down proteomics. Nat. Methods 14, 909–914 (2017).
Taus, T. et al. Universal and confident phosphorylation site localization using phosphoRS. J. Proteome Res. 10, 5354–5362 (2011).
Olsen, J. V. et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635–648 (2006).
Smith, L. M. et al. A five-level classification system for proteoform identifications. Nat. Methods 16, 939–940 (2019).
Marx, H. et al. A large synthetic peptide and phosphopeptide reference library for mass spectrometry-based proteomics. Nat. Biotechnol. 31, 557–564 (2013).
Halim, A. et al. Assignment of saccharide identities through analysis of oxonium ion fragmentation profiles in LC–MS/MS of glycopeptides. J. Proteome Res. 13, 6024–6032 (2014).
Polasky, D. A., Yu, F., Teo, G. C. & Nesvizhskii, A. I. Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat. Methods https://doi.org/10.1038/s41592-020-0967-9 (2020).
Bern, M., Kil, Y. J. & Becker, C. Byonic: advanced peptide and protein identification software. Curr. Protoc. Bioinformatics 40, 13.20.1–13.20.14 (2012).
Bern, M., Cai, Y. & Goldberg, D. Lookup peaks: a hybrid of de novo sequencing and database search for protein identification by tandem mass spectrometry. Anal. Chem. 79, 1393–1400 (2007).
Malaker, S. A. et al. The mucin-selective protease StcE enables molecular and functional analysis of human cancer-associated mucins. Proc. Natl Acad. Sci. USA 116, 7278–7287 (2019).
Choo, M. S., Wan, C., Rudd, P. M. & Nguyen-Khuong, T. GlycopeptideGraphMS: improved glycopeptide detection and identification by exploiting graph theoretical patterns in mass and retention time. Anal. Chem. 91, 7236–7244 (2019).
Klein, J. & Zaia, J. Relative retention time estimation improves N-glycopeptide identifications by LC–MS/MS. J. Proteome Res. 19, 2113–2121 (2020).
Khatri, K., Klein, J. A. & Zaia, J. Use of an informed search space maximizes confidence of site-specific assignment of glycoprotein glycosylation. Anal. Bioanal. Chem. 409, 607–618 (2017).
Liu, M. Q. et al. pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification. Nat. Commun. 8, 438 (2017).
Lee, L. Y. et al. Toward automated N-glycopeptide identification in glycoproteomics. J. Proteome Res. 15, 3904–3915 (2016).
The, M., MacCoss, M. J., Noble, W. S. & Käll, L. Fast and accurate protein false discovery rates on large-scale proteomics data sets with Percolator 3.0. J. Am. Soc. Mass Spectrom. 27, 1719–1727 (2016).
Chalkley, R. J., Medzihradszky, K. F., Darula, Z., Pap, A. & Baker, P. R. The effectiveness of filtering glycopeptide peak list files for Y ions. Mol. Omics 16, 147–155 (2020).
Baker, P. R., Trinidad, J. C. & Chalkley, R. J. Modification site localization scoring integrated into a search engine. Mol. Cell. Proteomics https://doi.org/10.1074/mcp.M111.008078 (2011).
Park, G. W. et al. Classification of mucin-type O-glycopeptides using higher-energy collisional dissociation in mass spectrometry. Anal. Chem. 92, 9772–9781 (2020).
Xu, G., Goonatilleke, E., Wongkham, S. & Lebrilla, C. B. Deep structural analysis and quantitation of O-linked glycans on cell membrane reveal high abundances and distinct glycomic profiles associated with cell type and stages of differentiation. Anal. Chem. 92, 3758–3768 (2020).
Wenger, C. D. & Coon, J. J. A proteomics search algorithm specifically designed for high-resolution tandem mass spectra. J. Proteome Res. 12, 1377–1386 (2013).
Khan, A. & Mathelier, A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics 18, 287, https://doi.org/10.1186/s12859-017-1708-7 (2017).
Lang, T. et al. Searching the evolutionary origin of epithelial mucus protein components—mucins and FCGBP. Mol. Biol. Evol. 33, 1921–1936 (2016).
Shin, J. et al. Use of composite protein database including search result sequences for mass spectrometric analysis of cell secretome. PLoS ONE 10, e0121692 (2015).
Uhlen, M. et al. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Bateman, A. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2019).
Park, J. H. et al. Proteomic analysis of host cell protein dynamics in the culture supernatants of antibody-producing CHO cells. Sci. Rep. 7, 44246 (2017).
Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).
We appreciate discussions with Z. Rolfs, R.J. Millikin and other Smith group members that helped improve software analysis speed and address challenges in implementing ideas. This work was supported by National Institutes of Health (NIH) grant no. R35 GM126914 awarded to L.M.S. and grant no. R01 CA200423 awarded to C.R.B., as well as by the Howard Hughes Medical Institute. N.M.R. was funded through an NIH Predoctoral to Postdoctoral Transition Award (grant no. K00 CA212454-03).
C.R.B. is a cofounder and Scientific Advisory Board member of Lycia Therapeutics, Palleon Pharmaceuticals, Enable Bioscience, Redwood Biosciences (a subsidiary of Catalent) and InterVenn Biosciences, and a member of the Board of Directors of Eli Lilly & Company.
Editor recognition statement: Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lu, L., Riley, N.M., Shortreed, M.R. et al. O-Pair Search with MetaMorpheus for O-glycopeptide characterization. Nat Methods 17, 1133–1138 (2020). https://doi.org/10.1038/s41592-020-00985-5