StrucGP: de novo structural sequencing of site-specific N-glycan on glycoproteins using a modularization strategy

Abstract

Precision mapping of glycans at the structural and site-specific levels remains one of the most challenging tasks in glycobiology. Here, we describe a modularization strategy for de novo interpretation of N-glycan structures on intact glycopeptides using tandem mass spectrometry. An algorithm named StrucGP was also developed to automate the interpretation process for large-scale analysis. By dividing an N-glycan into three modules and identifying each module using distinct patterns of Y ions or a combination of distinguishable B/Y ions, the method enables determination of detailed glycan structures on thousands of glycosites in mouse brain, which comprise four types of core structure and 17 branch structures with three glycan subtypes. Owing to its database-independent glycan mapping strategy, StrucGP also facilitates the identification of rare or new glycan structures. The approach will be of great benefit for in-depth structural and functional studies of glycoproteins in biomedical research.
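As a rough illustration of what a "pattern of Y ions" means in practice, the following Python sketch computes the m/z values of a peptide that retains successively larger pieces of the N-glycan core. It is a minimal sketch, not StrucGP's implementation: the function names, the ladder of fragments and the example peptide mass are assumptions made for illustration, and only the monoisotopic residue masses are standard values.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.079373,
    "Hex": 162.052824,
    "Fuc": 146.057909,
}
PROTON = 1.007276  # mass of a proton (Da)

# Illustrative (assumed) ladder of core fragments retained on the peptide,
# written as monosaccharide counts.
CORE_LADDER = [
    {"HexNAc": 1},
    {"HexNAc": 2},
    {"HexNAc": 2, "Hex": 1},
    {"HexNAc": 2, "Hex": 2},
    {"HexNAc": 2, "Hex": 3},
]


def glycan_mass(comp):
    """Sum of residue masses for a composition given as {monosaccharide: count}."""
    return sum(RESIDUE_MASS[name] * count for name, count in comp.items())


def y_ion_mz(peptide_mass, comp, charge=1):
    """m/z of a Y ion: neutral peptide plus retained glycan residues plus protons."""
    return (peptide_mass + glycan_mass(comp) + charge * PROTON) / charge


def core_y_ladder(peptide_mass, core_fucose=False, charge=1):
    """Y-ion m/z values for each fragment in CORE_LADDER, optionally with one fucose."""
    ladder = []
    for fragment in CORE_LADDER:
        comp = dict(fragment)  # copy so the template is not modified
        if core_fucose:
            comp["Fuc"] = 1
        ladder.append(round(y_ion_mz(peptide_mass, comp, charge), 4))
    return ladder


if __name__ == "__main__":
    # 1000.0 Da is a hypothetical neutral peptide mass.
    print(core_y_ladder(1000.0))
    print(core_y_ladder(1000.0, core_fucose=True))
```

Comparing the two printed ladders shows the idea behind module-level identification: a fucosylated core shifts every rung by the mass of one fucose residue, so the two variants leave different Y-ion series in the same spectrum.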

Fig. 1: Principle of the new method for structural interpretation of site-specific N-glycans.
Fig. 2: The performance of StrucGP on structural analysis of site-specific N-glycans from standard glycoproteins.
Fig. 3: Site-specific N-glycan structure analysis of mouse brain.
Fig. 4: StrucGP reveals glycan isoforms and new/rare glycans in mouse brain.
Fig. 5: Error rate control and probability estimation of StrucGP results.
Fig. 6: Comparison of StrucGP with GPQuest v.2.0, Byonic v.3.0 and pGlyco v.2.0.

Data availability

The mass spectrometry data, as well as all spectra for identified glycopeptides from different samples, have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository (ref. 44) with the dataset identifier PXD025859.

Code availability

StrucGP was developed in Python, and the standalone software package can be downloaded from the Zenodo repository (https://doi.org/10.5281/zenodo.4925441) (ref. 43).

References

  1. Moremen, K. W., Tiemeyer, M. & Nairn, A. V. Vertebrate protein glycosylation: diversity, synthesis and function. Nat. Rev. Mol. Cell Biol. 13, 448–462 (2012).

  2. Reily, C., Stewart, T. J., Renfrow, M. B. & Novak, J. Glycosylation in health and disease. Nat. Rev. Nephrol. 15, 346–366 (2019).

  3. Ng, B. G. & Freeze, H. H. Perspectives on glycosylation and its congenital disorders. Trends Genet. 34, 466–476 (2018).

  4. Stowell, S. R., Ju, T. & Cummings, R. D. Protein glycosylation in cancer. Annu. Rev. Pathol. 10, 473–510 (2015).

  5. Dwek, R. A., Butters, T. D., Platt, F. M. & Zitzmann, N. Targeting glycosylation as a therapeutic approach. Nat. Rev. Drug Discov. 1, 65–75 (2002).

  6. Lu, Q., Li, S. & Shao, F. Sweet talk: protein glycosylation in bacterial interaction with the host. Trends Microbiol. 23, 630–641 (2015).

  7. Stencel-Baerenwald, J. E., Reiss, K., Reiter, D. M., Stehle, T. & Dermody, T. S. The sweet spot: defining virus–sialic acid interactions. Nat. Rev. Microbiol. 12, 739–749 (2014).

  8. Bhat, A. H., Maity, S., Giri, K. & Ambatipudi, K. Protein glycosylation: sweet or bitter for bacterial pathogens? Crit. Rev. Microbiol. 45, 82–102 (2019).

  9. Sun, S. et al. Comprehensive analysis of protein glycosylation by solid-phase extraction of N-linked glycans and glycosite-containing peptides. Nat. Biotechnol. 34, 84–88 (2016).

  10. Watanabe, Y., Allen, J. D., Wrapp, D., McLellan, J. S. & Crispin, M. Site-specific glycan analysis of the SARS-CoV-2 spike. Science 369, 330–333 (2020).

  11. Wu, C. Y. et al. Influenza A surface glycosylation and vaccine design. Proc. Natl Acad. Sci. USA 114, 280–285 (2017).

  12. Xiao, H., Sun, F., Suttapitugsakul, S. & Wu, R. Global and site-specific analysis of protein glycosylation in complex biological systems with mass spectrometry. Mass Spectrom. Rev. 38, 356–379 (2019).

  13. Zhu, Z. & Desaire, H. Carbohydrates on proteins: site-specific glycosylation analysis by mass spectrometry. Annu. Rev. Anal. Chem. 8, 463–483 (2015).

  14. Jensen, P. H., Karlsson, N. G., Kolarich, D. & Packer, N. H. Structural analysis of N- and O-glycans released from glycoproteins. Nat. Protoc. 7, 1299–1310 (2012).

  15. Rojas-Macias, M. A. et al. Towards a standardized bioinformatics infrastructure for N- and O-glycomics. Nat. Commun. 10, 3275 (2019).

  16. Liu, M. Q. et al. pGlyco 2.0 enables precision N-glycoproteomics with comprehensive quality control and one-step mass spectrometry for intact glycopeptide identification. Nat. Commun. 8, 438 (2017).

  17. Bern, M., Kil, Y. J. & Becker, C. Byonic: advanced peptide and protein identification software. Curr. Protoc. Bioinformatics, 13.20.11–13.20.14 (2012).

  18. Toghi Eshghi, S., Shah, P., Yang, W., Li, X. & Zhang, H. GPQuest: a spectral library matching algorithm for site-specific assignment of tandem mass spectra to intact N-glycopeptides. Anal. Chem. 87, 5181–5188 (2015).

  19. Polasky, D. A., Yu, F., Teo, G. C. & Nesvizhskii, A. I. Fast and comprehensive N- and O-glycoproteomics analysis with MSFragger-Glyco. Nat. Methods 17, 1125–1132 (2020).

  20. Lu, L., Riley, N. M., Shortreed, M. R., Bertozzi, C. R. & Smith, L. M. O-pair search with MetaMorpheus for O-glycopeptide characterization. Nat. Methods 17, 1133–1138 (2020).

  21. Yu, C. & Huang, L. Cross-linking mass spectrometry: an emerging technology for interactomics and structural biology. Anal. Chem. 90, 144–165 (2018).

  22. Steentoft, C. et al. Mining the O-glycoproteome using zinc-finger nuclease-glycoengineered SimpleCell lines. Nat. Methods 8, 977–982 (2011).

  23. Liu, F., Rijkers, D. T., Post, H. & Heck, A. J. Proteome-wide profiling of protein assemblies by cross-linking mass spectrometry. Nat. Methods 12, 1179–1184 (2015).

  24. Woo, C. M., Iavarone, A. T., Spiciarich, D. R., Palaniappan, K. K. & Bertozzi, C. R. Isotope-targeted glycoproteomics (IsoTaG): a mass-independent platform for intact N- and O-glycopeptide discovery and analysis. Nat. Methods 12, 561–567 (2015).

  25. Marsico, G., Russo, L., Quondamatteo, F. & Pandit, A. Glycosylation and integrin regulation in cancer. Trends Cancer 4, 537–552 (2018).

  26. Jin, W. et al. Glycoqueuing: isomer-specific quantification for sialylation-focused glycomics. Anal. Chem. 91, 10492–10500 (2019).

  27. Wei, J. et al. Toward automatic and comprehensive glycan characterization by online PGC-LC-EED MS/MS. Anal. Chem. 92, 782–791 (2020).

  28. She, Y.-M., Tam, R. Y., Li, X., Rosu-Myles, M. & Sauvé, S. Resolving isomeric structures of native glycans by nanoflow porous graphitized carbon chromatography–mass spectrometry. Anal. Chem. 92, 14038–14046 (2020).

  29. Huang, Y., Nie, Y., Boyes, B. & Orlando, R. Resolving isomeric glycopeptide glycoforms with hydrophilic interaction chromatography (HILIC). J. Biomol. Tech. 27, 98–104 (2016).

  30. You, X. et al. Chemoenzymatic approach for the proteomics analysis of mucin-type core-1 O-glycosylation in human serum. Anal. Chem. 90, 12714–12722 (2018).

  31. Yang, M. et al. Separation and preparation of N-glycans based on ammonia-catalyzed release method. Glycoconj. J. 37, 165–174 (2020).

  32. Cao, C. et al. Purification of natural neutral N-glycans by using two-dimensional hydrophilic interaction liquid chromatography × porous graphitized carbon chromatography for glycan-microarray assay. Talanta 221, 121382 (2021).

  33. Devakumar, A., Mechref, Y., Kang, P., Novotny, M. V. & Reilly, J. P. Identification of isomeric N-glycan structures by mass spectrometry with 157 nm laser-induced photofragmentation. J. Am. Soc. Mass. Spectrom. 19, 1027–1040 (2008).

  34. Stadlmann, J., Pabst, M., Kolarich, D., Kunert, R. & Altmann, F. Analysis of immunoglobulin glycosylation by LC–ESI–MS of glycopeptides and oligosaccharides. Proteomics 8, 2858–2871 (2008).

  35. De Leoz, M. L. A. et al. NIST interlaboratory study on glycosylation analysis of monoclonal antibodies: comparison of results from diverse analytical methods. Mol. Cell. Proteom. 19, 11–30 (2020).

  36. Pagan, J. D., Kitaoka, M. & Anthony, R. M. Engineered sialylation of pathogenic antibodies in vivo attenuates autoimmune disease. Cell 172, 564–577 e513 (2018).

  37. Rendic, D., Wilson, I. B. H. & Paschinger, K. The glycosylation capacity of insect cells. Croat. Chem. Acta 81, 7–21 (2008).

  38. Hu, Y., Shah, P., Clark, D. J., Ao, M. & Zhang, H. Reanalysis of global proteomic and phosphoproteomic data identified a large number of glycopeptides. Anal. Chem. 90, 8065–8071 (2018).

  39. Nesvizhskii, A. I., Vitek, O. & Aebersold, R. Analysis and validation of proteomic data generated by tandem mass spectrometry. Nat. Methods 4, 787–797 (2007).

  40. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).

  41. Mucha, E. et al. Fucose migration in intact protonated glycan ions: a universal phenomenon in mass spectrometry. Angew. Chem. Int. Ed. Engl. 57, 7440–7443 (2018).

  42. Deutsch, E. W. et al. Trans-proteomic pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteom. Clin. Appl. 9, 745–754 (2015).

  43. Shen, J. & Sun, S. StrucGP: a software for structural interpretation of N-glycans on intact glycopeptides using tandem mass spectrometry data (Zenodo, 2021); https://doi.org/10.5281/zenodo.4925441

  44. Vizcaíno, J. A. et al. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res. 41, D1063–D1069 (2013).

Acknowledgements

This work was supported by the National Natural Science Foundation of China, grant nos. 91853123 (S.S.), 81773180 (S.S.), 21705127 (S.S.) and 81800655 (L.D.); National Key R&D Program of China, grant no. 2019YFA0905200 (S.S.) and China Postdoctoral Science Foundation, grant nos. 2019M653715 (L.D.), 2019TQ0260 (J.L.) and 2019M663798 (J.L.).

Author information

Contributions

S.S. and J.S. planned and designed the project. J.S., Y.S. and Jie Zhang developed the algorithm. L.J., R.L., Z.H., T.Z. and B.J. optimized protocols and prepared samples. B.Z., L.J., Z.H. and C.M. performed mass spectrometry analyses with MS parameter optimizations. J.S., S.S., J.L., Y.X., L.D., Z.C., J.W., C.M., N.G., J.B. and Y.Z. contributed to the data analyses and figure generation. Junying Zhang and S.S. coordinated the study. L.D., S.S. and J.S. prepared the manuscript with contributions from all coauthors.

Corresponding author

Correspondence to Shisheng Sun.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Workflow of StrucGP.

Workflow of StrucGP for glycan structure interpretation on intact glycopeptides. Related to Fig. 1a.

Extended Data Fig. 2 Glycan structures identified from standard glycoproteins.

The standard glycoproteins include (a) bovine RNase B, (b) bovine fetuin, (c) ovalbumin from chicken, and (d) human IgG including IgG1-4. e, An example of intact glycopeptide identification from fetuin using StrucGP. Related to Fig. 2b–e and Supplementary Table 1.

Extended Data Fig. 3 Site-specific N-glycan structure analysis of mouse brain.

a, Recognition and identification of four types of core structures in the mouse brain using their feature Y ion patterns. b, Determination of glycan subtypes in mouse brain using three feature Y ion patterns. Related to Fig. 3b,c.

Extended Data Fig. 4 Determination of branch glycan structures using feature B ions.

The branch glycan structures were identified in mouse brain and standard glycoproteins. The example spectra of each branch structure are shown in Supplementary Data 1.
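To make the notion of a "feature B ion" concrete, the following Python sketch computes singly protonated B-ion m/z values for a few small branch compositions and checks which of them appear in a peak list. It is an illustrative sketch only: the candidate list, the 20 ppm tolerance and the function names are assumptions, not values or APIs taken from StrucGP.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.079373,
    "Hex": 162.052824,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}
PROTON = 1.007276  # mass of a proton (Da)

# Hypothetical candidate branch fragments, keyed by a readable label.
CANDIDATES = {
    "HexNAc": {"HexNAc": 1},
    "HexNAc+Hex": {"HexNAc": 1, "Hex": 1},
    "NeuAc": {"NeuAc": 1},
    "HexNAc+Hex+NeuAc": {"HexNAc": 1, "Hex": 1, "NeuAc": 1},
}


def b_ion_mz(comp, charge=1):
    """m/z of a B ion: sum of residue masses plus protons, divided by charge."""
    mass = sum(RESIDUE_MASS[name] * count for name, count in comp.items())
    return (mass + charge * PROTON) / charge


def match_feature_ions(peaks, candidates, tol_ppm=20.0):
    """Return labels of candidates whose B-ion m/z is in `peaks` within tol_ppm."""
    matched = []
    for label, comp in candidates.items():
        target = b_ion_mz(comp)
        if any(abs(mz - target) / target * 1e6 <= tol_ppm for mz in peaks):
            matched.append(label)
    return matched


if __name__ == "__main__":
    # A made-up peak list (m/z values only) for demonstration.
    observed = [204.0867, 366.1395, 500.0]
    print(match_feature_ions(observed, CANDIDATES))
```

A real interpretation would also weigh peak intensities and confirm candidates with Y ions, as the Supplementary Data description indicates; this sketch covers only the mass-matching step.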

Extended Data Fig. 5 Glycan structures identified in mouse brain.

A total of 600 N-glycan structures were identified in mouse brain. Note that StrucGP cannot determine the location of each branch structure (that is, whether it is attached to the α-1,3 or the α-1,6 mannose). Related to Fig. 3 and Supplementary Table 2.

Extended Data Fig. 6 Glycan isoforms sharing the same glycan compositions.

This figure shows the glycan compositions for which at least seven glycan isoforms were identified in the mouse brain. Related to Fig. 4a,b.

Extended Data Fig. 7 Spectra of glycopeptides with three glycan isoforms of N4H5F1 on the same peptide.

The peptide TLN#CSGAHVK was modified by three different isoforms of the glycan HexNAc4Hex5Fuc1 (N4H5F1). Upper: peptide sequence identification using an MS/MS spectrum acquired at high HCD energy (HCD = 33%). Lower: glycan structure determination using MS/MS spectra acquired at low HCD energy (HCD = 20%). Related to Fig. 4c.

Extended Data Fig. 8 Spectra of glycopeptides with two glycan isoforms of N4H5F1 on the same peptide.

The peptide GN#GTLITFHSAFQCCGK was modified by two different isoforms of the glycan HexNAc4Hex5Fuc1 (N4H5F1). Upper: peptide sequence identification using an MS/MS spectrum acquired at high HCD energy (HCD = 33%). Lower: glycan structure determination using MS/MS spectra acquired at low HCD energy (HCD = 20%). Related to Fig. 4c.
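The shorthand N4H5F1 used in these captions expands to HexNAc4Hex5Fuc1. The following Python sketch parses that shorthand and computes the summed monoisotopic residue mass of the composition. The helper names are hypothetical, and only the three letters that appear in the captions (N, H, F) are supported.

```python
import re

# Letter codes as used in the captions: N = HexNAc, H = Hex, F = Fuc.
LETTER_TO_NAME = {"N": "HexNAc", "H": "Hex", "F": "Fuc"}

# Monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "HexNAc": 203.079373,
    "Hex": 162.052824,
    "Fuc": 146.057909,
}


def parse_composition(code):
    """Turn a shorthand such as 'N4H5F1' into {'HexNAc': 4, 'Hex': 5, 'Fuc': 1}."""
    if not re.fullmatch(r"(?:[NHF]\d+)+", code):
        raise ValueError(f"unsupported composition code: {code!r}")
    comp = {}
    for letter, count in re.findall(r"([NHF])(\d+)", code):
        name = LETTER_TO_NAME[letter]
        comp[name] = comp.get(name, 0) + int(count)
    return comp


def long_name(comp):
    """Format a composition as the long form, e.g. 'HexNAc4Hex5Fuc1'."""
    return "".join(f"{name}{count}" for name, count in comp.items())


def glycan_residue_mass(comp):
    """Sum of monoisotopic residue masses (mass added to the peptide by the glycan)."""
    return sum(RESIDUE_MASS[name] * count for name, count in comp.items())


if __name__ == "__main__":
    comp = parse_composition("N4H5F1")
    print(long_name(comp), round(glycan_residue_mass(comp), 4))
```

Because isoforms share one composition, they also share this mass. That is why the captions rely on fragment spectra rather than precursor mass to tell the isoforms apart.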

Extended Data Fig. 9 Glycan structures identified from five mouse tissues.

A total of 719 N-glycan structures were identified from five mouse tissues. The 25 mass spectrometry raw files (five raw files per mouse tissue) reported in the pGlyco 2.0 study were downloaded from the ProteomeXchange database.

Extended Data Fig. 10 Expression patterns of site-specific glycan structures in different mouse tissues.

The 25 mass spectrometry raw files reported in the pGlyco 2.0 study were downloaded from the ProteomeXchange database. a, Overall workflow of intact glycopeptide analysis of five different mouse tissues. b, Percentages of identified spectra in which the related substructures had probabilities of at least 99%, for the five mouse tissue datasets. c,d, Comparison of glycan core structures (c) and glycan subtypes (d) among intact glycopeptides identified by StrucGP from five mouse tissues. * indicates mouse brain glycopeptides identified in this study. e, Comparison of branch structures among five mouse tissues. The percentages in this figure were calculated based on the numbers of unique glycopeptides. Different isomers can be distinguished by specific feature B ions, using the same method as shown in Extended Data Fig. 4.

Supplementary information

Supplementary Information

Supplementary Notes 1–5, Figs. 1–5 and Table 6.

Reporting Summary

Supplementary Tables

Supplementary Table 1. List of intact glycopeptides with detailed glycan structures identified from standard glycoproteins with and without exoglycosidase treatments using StrucGP.

Supplementary Table 2. List of intact glycopeptides with detailed glycan structures identified from mouse brain using StrucGP.

Supplementary Table 3. List of intact glycopeptides identified in exoglycosidase-treated mouse brain samples using StrucGP.

Supplementary Table 4. List of intact glycopeptides with detailed glycan structures identified from Drosophila Schneider-2 cells using StrucGP.

Supplementary Table 5. List of intact glycopeptides with detailed glycan structures identified from five mouse tissues by analyzing previously reported mass spectrometry data with StrucGP.

Supplementary Data

Determination of branch glycan structures in mouse brain and standard glycoproteins using feature B ions with further validation using Y ions.

Cite this article

Shen, J., Jia, L., Dang, L. et al. StrucGP: de novo structural sequencing of site-specific N-glycan on glycoproteins using a modularization strategy. Nat Methods 18, 921–929 (2021). https://doi.org/10.1038/s41592-021-01209-0

  • DOI: https://doi.org/10.1038/s41592-021-01209-0
