Abstract
Precision mapping of glycans at the structural and site-specific levels remains one of the most challenging tasks in glycobiology. Here, we describe a modularization strategy for de novo interpretation of N-glycan structures on intact glycopeptides using tandem mass spectrometry. We also developed an algorithm, StrucGP, to automate the interpretation process for large-scale analysis. By dividing an N-glycan into three modules and identifying each module from distinct patterns of Y ions or a combination of distinguishable B/Y ions, the method enables determination of detailed glycan structures on thousands of glycosites in mouse brain, which comprise four types of core structure and 17 branch structures with three glycan subtypes. Because the glycan mapping strategy is database-independent, StrucGP also facilitates the identification of rare or new glycan structures. The approach will greatly benefit in-depth structural and functional studies of glycoproteins in biomedical research.
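As a rough illustration of the Y-ion side of this modular idea, the sketch below computes theoretical m/z values for Y ions along the common N-glycan core (peptide plus HexNAc2Hex3), optionally with a fucose attached. It is not StrucGP's implementation. The function names `to_mz` and `core_y_ions`, the example peptide mass and the charge states are hypothetical, and the only inputs assumed are standard monoisotopic residue masses.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
HEXNAC = 203.07937
HEX = 162.05282
FUC = 146.05791
PROTON = 1.007276


def to_mz(neutral_mass, charge=1):
    """Convert a neutral mass to the m/z of its protonated ion."""
    return (neutral_mass + charge * PROTON) / charge


def core_y_ions(peptide_mass, charge=1, core_fucose=False):
    """Return {label: m/z} for Y ions along the N-glycan core ladder.

    Hypothetical helper for illustration only: each Y ion is the intact
    peptide plus a growing piece of the HexNAc2Hex3 core.
    """
    # (label, number of HexNAc, number of Hex) for each step of the ladder.
    steps = [
        ("HexNAc1", 1, 0),
        ("HexNAc2", 2, 0),
        ("HexNAc2Hex1", 2, 1),
        ("HexNAc2Hex2", 2, 2),
        ("HexNAc2Hex3", 2, 3),
    ]
    ions = {}
    for label, n_hexnac, n_hex in steps:
        glycan = n_hexnac * HEXNAC + n_hex * HEX
        ions["pep+" + label] = to_mz(peptide_mass + glycan, charge)
        if core_fucose:
            # Fucosylated counterpart of the same Y ion.
            ions["pep+" + label + "Fuc1"] = to_mz(
                peptide_mass + glycan + FUC, charge
            )
    return ions


if __name__ == "__main__":
    # Hypothetical peptide with a neutral mass of 1000.0 Da.
    for label, mz in core_y_ions(1000.0, charge=2, core_fucose=True).items():
        print(f"{label:24s} {mz:.4f}")
```

Comparing which of the plain and fucose-bearing ladders is actually observed in a spectrum is one simple way a core type could be recognized; the published method relies on its own feature Y ion patterns.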
Data availability
The mass spectrometry data, as well as all spectra for identified glycopeptides from the different samples, have been deposited with the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository (ref. 43) under the dataset identifier PXD025859.
Code availability
StrucGP was developed in Python, and the standalone software package can be downloaded from the Zenodo repository (https://doi.org/10.5281/zenodo.4925441; ref. 44).
References
Moremen, K. W., Tiemeyer, M. & Nairn, A. V. Vertebrate protein glycosylation: diversity, synthesis and function. Nat. Rev. Mol. Cell Biol. 13, 448–462 (2012).
Reily, C., Stewart, T. J., Renfrow, M. B. & Novak, J. Glycosylation in health and disease. Nat. Rev. Nephrol. 15, 346–366 (2019).
Ng, B. G. & Freeze, H. H. Perspectives on glycosylation and its congenital disorders. Trends Genet. 34, 466–476 (2018).
Stowell, S. R., Ju, T. & Cummings, R. D. Protein glycosylation in cancer. Annu. Rev. Pathol. 10, 473–510 (2015).
Dwek, R. A., Butters, T. D., Platt, F. M. & Zitzmann, N. Targeting glycosylation as a therapeutic approach. Nat. Rev. Drug Discov. 1, 65–75 (2002).
Lu, Q., Li, S. & Shao, F. Sweet talk: protein glycosylation in bacterial interaction with the host. Trends Microbiol. 23, 630–641 (2015).
Stencel-Baerenwald, J. E., Reiss, K., Reiter, D. M., Stehle, T. & Dermody, T. S. The sweet spot: defining virus–sialic acid interactions. Nat. Rev. Microbiol. 12, 739–749 (2014).
Bhat, A. H., Maity, S., Giri, K. & Ambatipudi, K. Protein glycosylation: sweet or bitter for bacterial pathogens? Crit. Rev. Microbiol. 45, 82–102 (2019).
Sun, S. et al. Comprehensive analysis of protein glycosylation by solid-phase extraction of N-linked glycans and glycosite-containing peptides. Nat. Biotechnol. 34, 84–88 (2016).
Watanabe, Y., Allen, J. D., Wrapp, D., McLellan, J. S. & Crispin, M. Site-specific glycan analysis of the SARS-CoV-2 spike. Science 369, 330–333 (2020).
Wu, C. Y. et al. Influenza A surface glycosylation and vaccine design. Proc. Natl Acad. Sci. USA 114, 280–285 (2017).
Xiao, H., Sun, F., Suttapitugsakul, S. & Wu, R. Global and site-specific analysis of protein glycosylation in complex biological systems with mass spectrometry. Mass Spectrom. Rev. 38, 356–379 (2019).
Zhu, Z. & Desaire, H. Carbohydrates on proteins: site-specific glycosylation analysis by mass spectrometry. Annu. Rev. Anal. Chem. 8, 463–483 (2015).
Jensen, P. H., Karlsson, N. G., Kolarich, D. & Packer, N. H. Structural analysis of N- and O-glycans released from glycoproteins. Nat. Protoc. 7, 1299–1310 (2012).
Rojas-Macias, M. A. et al. Towards a standardized bioinformatics infrastructure for N- and O-glycomics. Nat. Commun. 10, 3275 (2019).
Liu, M. Q. et al. pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification. Nat. Commun. 8, 438 (2017).
Bern, M., Kil, Y. J. & Becker, C. Byonic: advanced peptide and protein identification software. Curr. Protoc. Bioinformatics, 13.20.11–13.20.14 (2012).
Toghi Eshghi, S., Shah, P., Yang, W., Li, X. & Zhang, H. GPQuest: a spectral library matching algorithm for site-specific assignment of tandem mass spectra to intact N-glycopeptides. Anal. Chem. 87, 5181–5188 (2015).
Polasky, D. A., Yu, F., Teo, G. C. & Nesvizhskii, A. I. Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat. Methods 17, 1125–1132 (2020).
Lu, L., Riley, N. M., Shortreed, M. R., Bertozzi, C. R. & Smith, L. M. O-pair search with MetaMorpheus for O-glycopeptide characterization. Nat. Methods 17, 1133–1138 (2020).
Yu, C. & Huang, L. Cross-linking mass spectrometry: an emerging technology for interactomics and structural biology. Anal. Chem. 90, 144–165 (2018).
Steentoft, C. et al. Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines. Nat. Methods 8, 977–982 (2011).
Liu, F., Rijkers, D. T., Post, H. & Heck, A. J. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat. Methods 12, 1179–1184 (2015).
Woo, C. M., Iavarone, A. T., Spiciarich, D. R., Palaniappan, K. K. & Bertozzi, C. R. Isotope-targeted glycoproteomics (IsoTaG): a mass-independent platform for intact N- and O-glycopeptide discovery and analysis. Nat. Methods 12, 561–567 (2015).
Marsico, G., Russo, L., Quondamatteo, F. & Pandit, A. Glycosylation and integrin regulation in cancer. Trends Cancer 4, 537–552 (2018).
Jin, W. et al. Glycoqueuing: isomer-specific quantification for sialylation-focused glycomics. Anal. Chem. 91, 10492–10500 (2019).
Wei, J. et al. Toward automatic and comprehensive glycan characterization by online PGC-LC-EED MS/MS. Anal. Chem. 92, 782–791 (2020).
She, Y.-M., Tam, R. Y., Li, X., Rosu-Myles, M. & Sauvé, S. Resolving isomeric structures of native glycans by nanoflow porous graphitized carbon chromatography–mass spectrometry. Anal. Chem. 92, 14038–14046 (2020).
Huang, Y., Nie, Y., Boyes, B. & Orlando, R. Resolving isomeric glycopeptide glycoforms with hydrophilic interaction chromatography (HILIC). J. Biomol. Tech. 27, 98–104 (2016).
You, X. et al. Chemoenzymatic approach for the proteomics analysis of mucin-type core-1 O-glycosylation in human serum. Anal. Chem. 90, 12714–12722 (2018).
Yang, M. et al. Separation and preparation of N-glycans based on ammonia-catalyzed release method. Glycoconj. J. 37, 165–174 (2020).
Cao, C. et al. Purification of natural neutral N-glycans by using two-dimensional hydrophilic interaction liquid chromatography × porous graphitized carbon chromatography for glycan-microarray assay. Talanta 221, 121382 (2021).
Devakumar, A., Mechref, Y., Kang, P., Novotny, M. V. & Reilly, J. P. Identification of isomeric N-glycan structures by mass spectrometry with 157 nm laser-induced photofragmentation. J. Am. Soc. Mass. Spectrom. 19, 1027–1040 (2008).
Stadlmann, J., Pabst, M., Kolarich, D., Kunert, R. & Altmann, F. Analysis of immunoglobulin glycosylation by LC–ESI–MS of glycopeptides and oligosaccharides. Proteomics 8, 2858–2871 (2008).
De Leoz, M. L. A. et al. NIST interlaboratory study on glycosylation analysis of monoclonal antibodies: comparison of results from diverse analytical methods. Mol. Cell. Proteom. 19, 11–30 (2020).
Pagan, J. D., Kitaoka, M. & Anthony, R. M. Engineered sialylation of pathogenic antibodies in vivo attenuates autoimmune disease. Cell 172, 564–577 e513 (2018).
Rendic, D., Wilson, I. B. H. & Paschinger, K. The glycosylation capacity of insect cells. Croat. Chem. Acta 81, 7–21 (2008).
Hu, Y., Shah, P., Clark, D. J., Ao, M. & Zhang, H. Reanalysis of global proteomic and phosphoproteomic data identified a large number of glycopeptides. Anal. Chem. 90, 8065–8071 (2018).
Nesvizhskii, A. I., Vitek, O. & Aebersold, R. Analysis and validation of proteomic data generated by tandem mass spectrometry. Nat. Methods 4, 787–797 (2007).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Mucha, E. et al. Fucose migration in intact protonated glycan ions: a universal phenomenon in mass spectrometry. Angew. Chem. Int. Ed. Engl. 57, 7440–7443 (2018).
Deutsch, E. W. et al. Trans-proteomic pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteom. Clin. Appl. 9, 745–754 (2015).
Shen, J. & Sun, S. StrucGP: a software for structural interpretation of N-glycans on intact glycopeptides using tandem mass spectrometry data (Zenodo, 2021); https://doi.org/10.5281/zenodo.4925441
Vizcaíno, J. A. et al. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res. 41, D1063–D1069 (2013).
Acknowledgements
This work was supported by the National Natural Science Foundation of China, grant nos. 91853123 (S.S.), 81773180 (S.S.), 21705127 (S.S.) and 81800655 (L.D.); National Key R&D Program of China, grant no. 2019YFA0905200 (S.S.) and China Postdoctoral Science Foundation, grant nos. 2019M653715 (L.D.), 2019TQ0260 (J.L.) and 2019M663798 (J.L.).
Author information
Contributions
S.S. and J.S. planned and designed the project. J.S., Y.S. and Jie Zhang developed the algorithm. L.J., R.L., Z.H., T.Z. and B.J. optimized protocols and prepared samples. B.Z., L.J., Z.H. and C.M. performed mass spectrometry analyses with MS parameter optimizations. J.S., S.S., J.L., Y.X., L.D., Z.C., J.W., C.M., N.G., J.B. and Y.Z. contributed to the data analyses and figure generation. Junying Zhang and S.S. coordinated the study. L.D., S.S. and J.S. prepared the manuscript with contributions from all coauthors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Workflow of StrucGP.
Workflow of StrucGP for glycan structure interpretation on intact glycopeptides. Related to Fig. 1a.
Extended Data Fig. 3 Site-specific N-glycan structure analysis of mouse brain.
a, Recognition and identification of four types of core structures in the mouse brain using their feature Y ion patterns. b, Determination of glycan subtypes in mouse brain using three feature Y ion patterns. Related to Fig. 3b,c.
Extended Data Fig. 4 Determination of branch glycan structures using feature B ions.
The branch glycan structures were identified in mouse brain and standard glycoproteins. The example spectra of each branch structure are shown in Supplementary Data 1.
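To make the idea of feature B ions concrete, the following minimal sketch computes singly protonated B ion m/z values for a few branch compositions and checks whether they appear in a peak list. The candidate branches, the 20 ppm tolerance, the peak list and the names `b_ion_mz`, `has_peak` and `matched_branches` are all illustrative assumptions; they are not the 17 branch structures or the scoring used by StrucGP, and mass alone cannot separate linkage isomers.

```python
PROTON = 1.007276

# Monoisotopic residue masses (Da).
RESIDUE = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}

# Illustrative branch compositions only (assumed for this sketch).
CANDIDATE_BRANCHES = {
    "HexNAc-Hex": ["HexNAc", "Hex"],
    "HexNAc-Hex-Fuc": ["HexNAc", "Hex", "Fuc"],
    "HexNAc-Hex-NeuAc": ["HexNAc", "Hex", "NeuAc"],
}


def b_ion_mz(residues):
    """m/z of a singly protonated B ion built from the given residues."""
    return sum(RESIDUE[r] for r in residues) + PROTON


def has_peak(peaks, target, tol_ppm=20.0):
    """True if any observed m/z lies within tol_ppm of the target."""
    return any(abs(p - target) / target * 1e6 <= tol_ppm for p in peaks)


def matched_branches(peaks, tol_ppm=20.0):
    """Names of candidate branches whose B ion is present in the peaks."""
    return [
        name
        for name, residues in CANDIDATE_BRANCHES.items()
        if has_peak(peaks, b_ion_mz(residues), tol_ppm)
    ]


if __name__ == "__main__":
    # Hypothetical low-mass region of an MS/MS spectrum (m/z values).
    observed = [204.0867, 366.1395, 657.2349]
    print(matched_branches(observed))
```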
Extended Data Fig. 6 Glycan isoforms sharing the same glycan compositions.
This figure shows the glycan compositions for which at least seven glycan isoforms were identified in the mouse brain. Related to Fig. 4a,b.
Extended Data Fig. 7 Spectra of glycopeptides with three glycan isoforms of N4H5F1 on the same peptide.
The peptide TLN#CSGAHVK was modified by three different isoforms of the glycan HexNAc4Hex5Fuc1 (N4H5F1). Upper: peptide sequence identification using an MS/MS spectrum acquired at high HCD energy (HCD = 30%). Lower: glycan structure determination using MS/MS spectra acquired at low HCD energy (HCD = 20%). Related to Fig. 4c.
Extended Data Fig. 8 Spectra of glycopeptides with two glycan isoforms of N4H5F1 on the same peptide.
The peptide GN#GTLITFHSAFQCCGK was modified by two different isoforms of the glycan HexNAc4Hex5Fuc1 (N4H5F1). Upper: peptide sequence identification using an MS/MS spectrum acquired at high HCD energy (HCD = 33%). Lower: glycan structure determination using MS/MS spectra acquired at low HCD energy (HCD = 20%). Related to Fig. 4c.
Extended Data Fig. 9 Glycan structures identified from five mouse tissues.
A total of 719 N-glycan structures were identified from five mouse tissues. The 25 mass spectrometry raw files (five raw files per mouse tissue) reported in the pGlyco 2.0 study were downloaded from the ProteomeXchange database.
Extended Data Fig. 10 Expression patterns of site-specific glycan structures in different mouse tissues.
The 25 mass spectrometry raw files reported in the pGlyco 2.0 study were downloaded from the ProteomeXchange database. a, Overall process of intact glycopeptide analysis for the five mouse tissues. b, Percentages of identified spectra with at least 99% probability for the related substructures in the data from the five mouse tissues. Comparison of glycan core structures (c) and glycan subtypes (d) among intact glycopeptides identified by StrucGP from the five mouse tissues. * indicates mouse brain glycopeptides identified in this study. e, Comparison of branch structures among the five mouse tissues. The percentages in this figure were calculated from the numbers of unique glycopeptides. Different isomers can be distinguished by specific feature B ions, using the same method shown in Extended Data Fig. 4.
Supplementary information
Supplementary Information
Supplementary Notes 1–5, Figs. 1–5 and Table 6.
Supplementary Tables
Supplementary Table 1. List of intact glycopeptides with detailed glycan structures identified from standard glycoproteins with and without exoglycosidase treatments using StrucGP.
Supplementary Table 2. List of intact glycopeptides with detailed glycan structures identified from mouse brain using StrucGP.
Supplementary Table 3. List of intact glycopeptides identified in exoglycosidase-treated mouse brain samples using StrucGP.
Supplementary Table 4. List of intact glycopeptides with detailed glycan structures identified from Drosophila Schneider-2 cells using StrucGP.
Supplementary Table 5. List of intact glycopeptides with detailed glycan structures identified from five mouse tissues by analyzing previously reported mass spectrometry data with StrucGP.
Supplementary Data
Determination of branch glycan structures in mouse brain and standard glycoproteins using feature B ions with further validation using Y ions.
Cite this article
Shen, J., Jia, L., Dang, L. et al. StrucGP: de novo structural sequencing of site-specific N-glycan on glycoproteins using a modularization strategy. Nat Methods 18, 921–929 (2021). https://doi.org/10.1038/s41592-021-01209-0
DOI: https://doi.org/10.1038/s41592-021-01209-0