Highly sensitive characterization of non-human glycan structures of monoclonal antibody drugs utilizing tandem mass spectrometry

Glycosylation is an important attribute of monoclonal antibodies (mAbs) for assessing manufacturing quality. Analysis of non-human glycans containing terminal galactose-α1,3-galactose and N-glycolylneuraminic acid is essential due to the potential immunogenicity and insufficient efficacy caused by mAb expression in non-human mammalian cells. Using parallel sequencing of isobaric glycopeptides and isomeric glycans that were separated by reversed-phase and porous graphitic carbon LC, we report a highly sensitive LC MS/MS method for the comprehensive characterization of low-abundance non-human glycans and their closely related structural isomers. We demonstrate that the straightforward use of high-abundance diagnostic ions and complementary fragments under the positive ionization low-energy collision-induced dissociation is a universal approach to rapidly discriminate branch-linkage structures of biantennary glycans. Our findings reveal the structural diversity of non-human glycans and sulfation of α-galactosylated glycans, providing both an analytical method and candidate structures that could potentially be used in the crucial quality control of therapeutic mAb products.

www.nature.com/scientificreports/ to endogenous anti-Neu5Gc antibodies 27 . Because these non-human epitopes function differently from conventional human-type epitopes, differentiating glycan analogues with similar structures is of high importance. A thorough evaluation of glycan variations would therefore properly delineate and characterize non-human glycan structures and post-glycosylational modifications to ensure the safety and efficacy of mAb drugs. Monoclonal antibody-based pharmaceuticals primarily comprise biantennary N-glycan structures, which have been characterized using liquid chromatography and tandem mass spectrometry (LC MS/MS); however, identifying and quantifying closely related non-human structural isomers, including terminal α1,3-Gal and Neu5Gc-linked glycans, and distinguishing them from human-type glycans remain major challenges of mAb glycomics 21,[28][29][30] . While hydrophilic interaction liquid chromatography (HILIC) is commonly used for separating glycans from proteolytic glycopeptides and released glycans 30 , the method is restricted by the low solubility of anionic and large-sized glycans in organic solvents and the poor separation of linkage-specific isomers. The chemical derivatization and extensive purification traditionally required for HILIC MS/MS analyses are time-consuming, and additional drawbacks, including incomplete derivatization, undesired side reactions and possible degradation, further limit the detection of low-abundance glycans. Porous graphitic carbon (PGC) chromatography has increasingly been applied to the analysis of linkage- and branch-specific isomers of sialylated glycans [31][32][33][34] . Progress has been achieved in the structural characterization of underivatized glycan isomers by PGC LC MS/MS 32 .
We have recently demonstrated that specific diagnostic ions and complementary fragments resulting from the low-energy collision-induced dissociation (CID) at the positive ionization can be used to discriminate α2,3 and α2,6 linkage sialoglycan isomers and sulfated glycans, providing an alternative method to those by means of low-abundance cross-ring cleavages induced by high-energy or negative ionization MS/MS [30][31][32]35 . This method provides high sensitivity for detecting low-abundance glycans and easy data interpretation of structural isomers, as well as a simple procedure that does not require chemical derivatization [36][37][38][39] . Furthermore, our studies showed that the straightforward glycoproteomic and glycomic approaches to analyze glycopeptides and underivatized glycans using mild conditions are able to preserve the structural integrity of anionic glycans 32,40,41 , and avoid the loss of acid-labile groups under the harsh acidic or alkaline conditions. Although glycan sequencing was established for the identification of sialic acid-linkage isomers and regioisomers of sulfoglycans 32 , the previous method did not report branch-specific glycan isomers of non-human glycan epitopes that are typical glycoforms present in therapeutic mAbs, nor distinguish them from similar conventional human-type glycans.
To achieve a thorough characterization of biologically relevant glycans with non-human glycan epitopes in mAbs, herein we conducted a parallel sequencing analysis of the N-glycopeptides and released glycans from mAb drugs produced in CHO cells (bevacizumab, rituximab, trastuzumab, adalimumab) and murine myeloma cell lines (infliximab, cetuximab, golimumab, palivizumab). An in-house glycan library is utilized for the glycoproteomic identification and in-depth characterization of isobaric glycopeptides by reversed-phase (RP) LC MS/MS. Comparative analyses of the released glycans by endoglycosidase PNGase F were conducted by PGC LC MS/MS, and non-human glycan isomers were subsequently identified by MS/MS sequencing using high abundance diagnostic ions and complementary fragment counterparts, exoglycosidase digestion and LC elution order of glycans. Through such analyses, our results identified low abundance non-human glycans of mAbs and importantly, delineated them from conventional isobaric and isomeric human-type glycan structures. We envision the straightforward glycan sequencing methodology and candidate structures can be utilized to evaluate the presence of non-human glycans in glycoprotein-based biological products.

Results and discussion
Identification of nonhuman N-glycans and the unusual N-glycan sulfation involving α-Gal. To first identify non-human glycans in various mAbs, we performed RP LC MS/MS analyses of tryptic digests of eight mAbs which revealed large sets of N-glycopeptides (Tables S1-S5), localizing in two distinct peak regions (Fig. S1). Consistent with previous literature 42 , the short Fc glycopeptides at residues EEQYNSTYR eluted faster than the long Fab glycopeptides at residues MNSLQSNDTAIYYCAR. MS/MS measurements determined a tremendous structural diversity of high-mannose, hybrid and biantennary complex N-glycans in the Fc domain [43][44][45] . Triantennary N-glycans of Fab glycopeptides were exclusively detected on cetuximab (Table S4). In addition, we also observed some low-intensity glycopeptides and conducted a retrospective examination on the structural features.
The α-Gal-containing N-glycans were all fucosylated in the mAb glycopeptides (Tables S3-S4), and most of them are known to exist as glycans with α1,6 core-fucosylation 21,28,46 . Fig. 1a,b show several types of glycopeptides containing α-Gal N-glycans (Table S6), in which the relatively high abundance glycoforms contain single and double α-Gal residues on the biantennary N-glycan structures of Fc domains and the Fab region of cetuximab. The aforementioned diagnostic α-Gal-containing glycan ions, accompanied by the high-intensity complementary fragments in the MS/MS spectra, were utilized to readily identify biantennary non-human glycans containing branched terminal α-Gal and Neu5Gc (Fig. 1c-f).
Notably, our database search identified the sulfation of α1,3-Gal-containing Fc N-glycans in palivizumab, golimumab, infliximab and cetuximab. Figure 2 shows the MS/MS spectra of sulfated palivizumab N-glycopeptides at residues TRPREEQYNSTYR, observed as triply charged ions at m/z 1120.1225 and m/z 1174.1401. Two glycopeptides containing either a conventional single Gal-β1,4-GlcNAc or a non-human Gal-α1,3-Gal-β1,4-GlcNAc extension at the termini of N-glycans co-eluted as an integrated peak at ~ 11.70 min (Fig. 2a). Both glycopeptides displayed the base peak fragments of the triply charged ion [M-80 Da + 3H] 3+ , and the doubly charged fragment ions followed by the consecutive losses of GlcNAc and neutral molecules of 80 Da (Fig. 2b,c). Accurate mass measurements of the glycopeptides indicate that the 80 Da molecule is a sulfate (SO 3 2− , 79.9568 u) rather than a phosphate group (HPO 3 − , 79.9663 u), resulting in mass errors of 0 ppm and −3 ppm, respectively. We propose that the digalactosylated structure (Fig. 2c) contains an α-Gal-Gal-GlcNAc-Man branch, as indicated by the presence of the two diagnostic ions at m/z 528.14 ([α-Gal-Gal-GlcNAc + H] + ) and m/z 690.22 ([α-Gal-Gal-GlcNAc-Man + H] + ), and the intensity of the former fragment is higher than that of the m/z 366.21 ([Hex-GlcNAc + H] + ) peak. In contrast, the monogalactosylated species (Fig. 2b) shows a relatively higher fragment at m/z 366.15 compared to the ion at m/z 528.25 (with no additional fragment at m/z 690.22), which can be attributed to the terminal branch fragmentations of Gal-GlcNAc and Gal-GlcNAc-Man, respectively. The possible human-type digalactosylated glycan G2F, which has the same mass as the digalactosylated glycan containing a terminal Gal-α1,3-Gal residue in Fig. 2c, is excluded, as the predicted fragmentation would lead to a high-intensity branch-specific peak at m/z 366.15 ([Gal-GlcNAc + H] + ) as in Fig. 2b, which is not observed in Fig. 2c. We thus conclude that sulfation occurs on an N-glycan comprising the non-human α-Gal epitope.
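The sulfate-versus-phosphate assignment rests on a mass difference of only about 9.5 mDa. As a minimal sketch of that arithmetic (not the software used in this study), the snippet below computes the monoisotopic masses of SO3 and HPO3 from standard atomic masses and expresses their difference in ppm relative to the neutral mass implied by the triply charged precursor at m/z 1120.1225. The function and variable names are illustrative assumptions.

```python
# Monoisotopic atomic masses (u) for the elements needed here.
MONO = {"H": 1.00782503, "O": 15.99491462, "P": 30.97376200, "S": 31.97207117}
PROTON = 1.00727647  # mass of a proton (u)


def formula_mass(composition):
    """Monoisotopic mass of an elemental composition such as {"S": 1, "O": 3}."""
    return sum(MONO[element] * count for element, count in composition.items())


def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON)


def ppm(delta, reference):
    """Express a mass difference in parts per million of a reference mass."""
    return delta / reference * 1e6


so3 = formula_mass({"S": 1, "O": 3})            # sulfate modification
hpo3 = formula_mass({"H": 1, "P": 1, "O": 3})   # phosphate modification
precursor = neutral_mass(1120.1225, 3)          # m/z taken from the text above

# Mass error expected if a sulfated glycopeptide were annotated as phosphorylated.
shift_ppm = ppm(so3 - hpo3, precursor)

print(f"SO3  = {so3:.4f} u")
print(f"HPO3 = {hpo3:.4f} u")
print(f"sulfate measured against a phosphate assignment: {shift_ppm:.1f} ppm")
```

A shift of this size is resolvable only when the precursor mass is measured with low-ppm accuracy, which is why the accurate-mass survey scan, rather than the MS/MS fragments, carries this distinction.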
Overall, our analyses at the glycoproteomic level revealed the co-existence of non-human glycans in the Fc and Fab domains of mAb products, with varying abundance. Interestingly, we observed negatively charged sulfation at α-Gal containing N-glycans of the recombinant mAbs expressed from the murine myeloma cells. While sulfation commonly occurs in hybrid and complex-type N-glycans of therapeutic proteins at the C-3 of Gal, C-6 of GlcNAc and C-4 of N-acetylgalactosamine (GalNAc) residues of terminal Gal-GlcNAc and GalNAc-GlcNAc sequons 40,41,47 , the finding of sulfated α-Gal-glycans in mAb drugs suggests a previously unrecognized quality attribute that may require monitoring, and further understanding of its biological properties.

Differentiation of isobaric Neu5Gc and α-Gal-containing glycopeptides. [...] of the two glycopeptides, and the predicted cysteine-carbamidomethylated glycopeptide (4144.6249 Da) suggest additional S-carbamidomethylation of methionine (+ 57.0215 Da) in the peptide sequence (MNSLQSNDTAIYYCAR) from iodoacetamide treatments, and the deamidation of asparagine residue (+ 0.9840 Da) of the latter peptide (Table S4). Both glycopeptides yielded the high abundance fragments following the losses of 105 Da from the glycopeptide precursor ion, corresponding to 2-(methylthio)acetamide (C 3 H 7 NOS, 105.0248 Da) 40 . Therefore, glycopeptide sequencing using high abundance diagnostic ions and complementary fragments simplified data interpretation to discriminate the subtle structural difference from either glycan compositions or peptide modifications of glycopeptide analogues.
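The mass shifts quoted for the peptide modifications can be reproduced from elemental compositions. The sketch below is an illustration rather than part of the reported workflow. It assumes that carbamidomethylation adds C2H3NO, that deamidation exchanges NH for O, and that the neutral loss is 2-(methylthio)acetamide (C3H7NOS) as stated in the text. The helper name `delta_mass` is hypothetical.

```python
# Monoisotopic atomic masses (u).
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def delta_mass(gained, lost=None):
    """Net monoisotopic mass change for atoms gained minus atoms lost."""
    lost = lost or {}
    plus = sum(MONO[element] * count for element, count in gained.items())
    minus = sum(MONO[element] * count for element, count in lost.items())
    return plus - minus


# Carbamidomethylation: net addition of C2H3NO to the modified residue.
carbamidomethyl = delta_mass({"C": 2, "H": 3, "N": 1, "O": 1})

# Deamidation of asparagine: the side-chain NH2 becomes OH (gain O, lose N and H).
deamidation = delta_mass({"O": 1}, {"N": 1, "H": 1})

# Neutral loss from S-carbamidomethylated methionine: 2-(methylthio)acetamide.
methylthioacetamide = delta_mass({"C": 3, "H": 7, "N": 1, "O": 1, "S": 1})

for name, value in [
    ("carbamidomethylation", carbamidomethyl),
    ("deamidation", deamidation),
    ("2-(methylthio)acetamide loss", methylthioacetamide),
]:
    print(f"{name}: {value:.4f} Da")
```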
Structural diversity of α-Gal containing glycan isomers. To better identify the structure of glycan isomers, mAb glycoproteins were digested by PNGase F and the released glycans were analyzed by PGC LC MS/MS. As demonstrated by our previous studies 32 , the structural isomers of native glycans can be well-resolved using positive ionization low-energy CID. Glycomic screening of eight mAbs revealed the most abundant glycans of G0F, G1F and G2F (Fig. S8 and Tables S7-S8). Pairs of agalactosylated, monogalactosylated and digalactosylated glycans are reasonably recognized as structural anomers in which the α-anomer has higher abundance and longer RT than the β-anomer 32 . The chromatographic profile shows that the relative intensities of the neutral glycans are comparable to those glycopeptides obtained by RP LC MS/MS (Fig. S1). Subsequent structure assignment of low abundance glycans was accomplished based on the elution order of glycan isomers, abundant MS/MS fragments and exoglycosidase sequencing 32 . Indeed, LC retention time of biantennary glycans represents the most critical element for structural elucidation of glycan isomers due to its direct relevance to the isomeric separation by PGC 28 . Extensive studies have shown that the complex glycan with an elongated branch chain at the α6 antenna of the N-acetyl chitobiose (GlcNAc 2 )-containing-trimannosyl core elutes faster than the isomeric glycan at the α3 antenna on PGC 28,35,[49][50][51] .
To gain further insight into delineating differences in the structure-based MS/MS fragmentation of biantennary glycans, we examined the PGC LC MS/MS fragmentation patterns of G1 and G1F glycans from glycan standards and known structural isomers of fetuin and mAbs (Figs. S9-S10). Branch-specific fragmentation of the GlcNAc-β1,2-Man glycosidic linkage of the α3 antenna consistently yielded higher intensity B and Y fragments than those of the corresponding linkage on the α6 antenna. Structural modeling of the glycans using Glycam shows potential stabilizing intramolecular hydrogen bonds between the C3 hydroxyl of the α6 mannose residue of the trimannosyl core with the cyclic oxygen atom of the adjacent non-reducing GlcNAc residue, with the α6 antenna having a slightly shorter distance compared to the α3 antenna (2.40 vs 2.51 Å); an additional potential hydrogen bond is observed between the C2 NHAc group of the core GlcNAc residue with the non-reducing GlcNAc residue of the α6 antenna, an interaction that is absent in the α3 antenna (Fig. S11a,b). Such structural features are also visible between the β1,2-GlcNAc linkage branch of α6 antenna and the trimannosyl core in triantennary and tetraantennary glycans (Fig. S11c,d). These data are consistent with the fragmentation patterns of the other glycans reported previously 28 .
Using this method, along with retrospective analyses on glycan isomers, we confirmed that (1) biantennary complex glycan isomers containing a reducing end N-acetyl chitobiose core were separated on an elution order of the glycan with an elongated branch chain on the α6 antenna followed by that on the α3 antenna by PGC; (2) the presence of high abundance diagnostic ions, accompanied by the complementary MS/MS fragments with the neutral loss from precursor ions, is often indicative of the branching side chain of a glycan; (3) MS/MS fragmentation of a biantennary complex glycan yields a higher intensity fragment with the neutral loss of branch at the α3 antenna than that at the α6 antenna; (4) the loss of water from complementary Y fragments to form [Y-18] + ions result from either the reducing end of the glycan β-anomer, or the branch-specific single fragment at the intact α6 antenna of glycan α-anomers, corroborating our recent report 32 . We thus establish the acceptable criteria for sequencing biantennary glycan isomers which constitute the base for subsequent characterization of isomeric structures of non-human glycans in mAb drugs (Table S9).
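Criteria (1) and (3) can be read as a simple decision rule for a pair of positional isomers. The sketch below is a hedged illustration of that rule, not the authors' implementation. It assumes both observations are the same anomer and that the branch-loss fragment intensities are normalized in the same way. The class name, function name and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IsomerObservation:
    label: str
    retention_min: float           # PGC retention time in minutes
    branch_loss_intensity: float   # relative intensity of the branch-loss fragment


def assign_antennae(first, second):
    """Assign the elongated branch of two positional isomers to an antenna.

    Criterion (1): the isomer elongated on the alpha6 antenna elutes first on PGC.
    Criterion (3): loss of the branch on the alpha3 antenna gives the more intense
    fragment, which is used here only as a consistency check.
    """
    early, late = sorted((first, second), key=lambda obs: obs.retention_min)
    assignment = {early.label: "alpha6", late.label: "alpha3"}
    fragments_agree = late.branch_loss_intensity > early.branch_loss_intensity
    return assignment, fragments_agree


# Hypothetical observations for two isomers of the same composition.
iso_a = IsomerObservation("isomer A", retention_min=30.0, branch_loss_intensity=35.0)
iso_b = IsomerObservation("isomer B", retention_min=31.0, branch_loss_intensity=80.0)

assignment, fragments_agree = assign_antennae(iso_b, iso_a)
print(assignment)
print("MS/MS intensities consistent with elution order:", fragments_agree)
```

When the two criteria disagree, the rule reports the conflict instead of forcing an assignment, which mirrors the use of exoglycosidase digestion as independent confirmation in the text.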
With the ability to better separate released glycans using PGC compared to glycopeptides using RP LC, and importantly without the need for further chemical derivatization, we delineated several isomeric glycoforms, including low abundance non-human glycans from human-compatible glycans (Fig. 4). A parallel comparison of the EICs of proteolytic glycopeptides and the released glycans from hybrid and pseudohybrid glycans containing potential α3-Gal linkages at m/z 792.7932 revealed up to eight isomers in the murine myeloma cell-derived mAbs (Fig. 4a and Fig. S12). In light of similar MS/MS fragmentation patterns, the pairs of glycans 1 and 3, 2 and 4, 5 and 6, 7 and 8 were identified as β- and α-anomers, respectively, of each pair of glycans (Fig. S13). Only glycans 1 and 3 showed high abundance of the diagnostic ion at m/z 366.1 ([Gal-GlcNAc + H] + ) and its high-intensity complementary Y fragment at m/z 1219.33 ([MH-365] + ), confirming a Gal-GlcNAc-branched chain at the core α1,3 antenna of a Man 4 -based hybrid glycan. Exoglycosidase sequencing with α1-2,3 mannosidase revealed that compounds 1 and 3 have an unbranched mannose residue localizing at the extended α1,3 arm from the core α1,6 antenna (Fig. S14). In contrast, the MS/MS spectra of glycan isomers 2 and 4 display the high-intensity Y fragments at m/z 1381.42 with the neutral loss of GlcNAc from the precursor ion at m/z 1584.58, suggesting a GlcNAc branch at the core α1,3 antenna of Man 5 -based hybrid glycans. It is also interesting to note that the relatively high abundance isomers 5 and 6, 7 and 8 were extensively detected in murine myeloma cell-derived mAbs (Fig. S12f-j). The occurrence of the most abundant diagnostic ion at m/z 528.18 ([Gal-Gal-GlcNAc + H] + ) and its high-intensity complementary fragment at m/z 1057.37 ([MH-527] + ) indicates a terminal α-Gal residue in the glycan, which was verified by exoglycosidase sequencing by α1-3,6 galactosidase (Fig. S14).
Treatments with α1-2,3 mannosidase revealed that the isomers 7 and 8 can be cleaved at the unbranched monomannose residue, and the structure is assigned to a Gal-α1,3-Gal-GlcNAc linkage to the α1,6 antenna (Figs. S12, S14). Correspondingly, isomers 5 and 6 were reasonably assigned to the anomeric structures of the glycan containing were identified by MS/MS fragmentations (Fig. S15). Therefore, analyses of glycan fragmentations unambiguously identified the low-abundance non-human glycans 5-8 containing the α-Gal epitope from the relatively high-abundance conventional human-type glycans 1-4.
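The complementary Y fragments discussed above follow from subtracting the lost branch from the singly protonated precursor. The sketch below illustrates the arithmetic under two assumptions: the EIC at m/z 792.7932 is a doubly protonated ion, and each neutral loss equals the sum of the monoisotopic residue masses of the lost branch. The function names are hypothetical. The calculated values agree with the ion-trap m/z values reported in the text to within roughly 0.1.

```python
PROTON = 1.00727647  # mass of a proton (u)

# Monoisotopic residue masses (u) of the monosaccharides involved.
RESIDUE = {"Hex": 162.05282, "HexNAc": 203.07937}


def singly_protonated(mz, charge):
    """Convert the m/z of an [M + zH]z+ ion to the m/z of [M + H]+."""
    return charge * mz - (charge - 1) * PROTON


def complementary_y(mh, lost_residues):
    """m/z of the Y fragment left after a branch is lost from [M + H]+."""
    return mh - sum(RESIDUE[name] for name in lost_residues)


mh = singly_protonated(792.7932, 2)

y_minus_gal_glcnac = complementary_y(mh, ["Hex", "HexNAc"])             # [MH-365]+
y_minus_glcnac = complementary_y(mh, ["HexNAc"])                        # loss of GlcNAc
y_minus_gal_gal_glcnac = complementary_y(mh, ["Hex", "Hex", "HexNAc"])  # [MH-527]+

print(f"[M+H]+            {mh:.3f}")
print(f"- Gal-GlcNAc      {y_minus_gal_glcnac:.3f}")
print(f"- GlcNAc          {y_minus_glcnac:.3f}")
print(f"- Gal-Gal-GlcNAc  {y_minus_gal_gal_glcnac:.3f}")
```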
Next, extended analyses of positional glycan isomers containing α-Gal residues in bi- and tri-antennary glycans were performed (Fig. 4d,e, Figs. S16-S18). Three pairs of biantennary glycan isomers including human-compatible G2F, and non-human α-Gal-containing G1(3)F and G1(6)F were identified (Fig. S16). Consistent with other complex G1 and G1F glycan isomers as mentioned above, glycans carrying the α-Gal residue on the α6-antenna eluted earlier than those carrying it on the α3-antenna, and the assignment was further confirmed by the higher intensity of the diagnostic ion following cleavage of the β1,2 glycosidic linkage between the GlcNAc and mannose residue on the α3-antenna (Fig. S16). Using similar analyses, we further discovered several glycan isomers containing mono-, di- and tri-α-Gal linkages of biantennary G2F and triantennary G3F in the mAbs (Figs. S17-S18).
Structural consistency between mono-Neu5Gc-sialylated and mono-α-galactosylated glycans at region-specific locations. Apart from α-Gal, terminal Neu5Gc is also produced in non-human cells. Further analyses of the released mAb glycans show that the mono-Neu5Gc-sialylated glycans are the predominant anionic glycoforms, although di-Neu5Gc linkages were also present in the biantennary glycans of mAbs produced in murine myeloma cells, consistent with previous reports 24 . The separation of afucosylated and fucosylated sialoglycans by PGC revealed several Neu5Gc-linkage branching isomers in the released underivatized glycan structures (Figs. S19, S20). MS/MS analyses identified several glycans with the typical Man 3-5 -based hybrid structures, which contain the Neu5Gc-sialylated epitope (Fig. 4a-c, Fig. S19a-c, e-g). As expected, the branch-specific isomers of Neu5Gc-sialylated biantennary glycans were present on the truncated branching complex glycan structures of G1S1, G1FS1 and G2FS1 (Fig. S9d, h, i), which can be differentiated by the diagnostic ions at m/z 673.23 ([Neu5Gc-Gal-GlcNAc + H] + ) and m/z 366.14 ([Gal-GlcNAc + H] + ) and the complementary fragment Y/[Y-18] ions from the neutral losses of branching side chains (Fig. S20c-h). The relative abundances of diagnostic ions depend on the branching location, because the glycosidic linkage at the α3 antenna is more easily cleaved than that at the α6 antenna. The Neu5Gc-branched complex glycan at the α3 antenna elutes after the α6 antenna isomer, providing supporting evidence for the aforementioned criteria of MS/MS glycan sequencing. Interestingly, a side-by-side comparison of the mono-Neu5Gc-sialylated and mono-α-galactosylated glycan structures observed in mAbs produced in murine myeloma cells showed a high structural consistency at the branch region-specific locations, as observed in the EICs of the released glycans (Fig. 4). Although the relative abundance of each individual non-human glycan is low, our ability to distinguish their antenna-specific location enables grouping of non-human epitopes based on their location and N-glycan type to determine their overall relative distribution (Table S10, Fig. S21). The specific location of carbohydrate residues in N-glycans can be important in receptor binding affinity and bioactivity [52][53][54] , and whether this remains true for non-human glycan epitopes requires further investigation. Hierarchical clustering analysis further shows the relative similarities in the presence of Neu5Gc vs α-Gal epitopes in different types of glycans (i.e. complex vs. (pseudo)hybrid vs. antenna fucosylated glycans) (Fig. S21).
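The diagnostic ions used throughout can be predicted from the residue composition of the branch. As a minimal sketch, assuming that each diagnostic ion is a B-type oxonium ion whose m/z equals the sum of the monoisotopic residue masses plus one proton, the snippet below derives the residue masses from elemental formulas. The table and function names are illustrative and do not describe the authors' software.

```python
# Monoisotopic atomic masses (u).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)

# Elemental compositions of monosaccharide residues (free sugar minus H2O).
RESIDUE_FORMULA = {
    "Hex": {"C": 6, "H": 10, "O": 5},               # Gal or Man
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},    # GlcNAc
    "Neu5Ac": {"C": 11, "H": 17, "N": 1, "O": 8},
    "Neu5Gc": {"C": 11, "H": 17, "N": 1, "O": 9},
}


def residue_mass(name):
    """Monoisotopic mass of one monosaccharide residue."""
    return sum(MONO[el] * n for el, n in RESIDUE_FORMULA[name].items())


def oxonium_mz(residues):
    """m/z of the singly protonated B-type ion for a list of residues."""
    return sum(residue_mass(name) for name in residues) + PROTON


DIAGNOSTIC_IONS = {
    "Gal-GlcNAc": oxonium_mz(["Hex", "HexNAc"]),
    "Gal-Gal-GlcNAc": oxonium_mz(["Hex", "Hex", "HexNAc"]),
    "Gal-Gal-GlcNAc-Man": oxonium_mz(["Hex", "Hex", "HexNAc", "Hex"]),
    "Neu5Ac-Gal-GlcNAc": oxonium_mz(["Neu5Ac", "Hex", "HexNAc"]),
    "Neu5Gc-Gal-GlcNAc": oxonium_mz(["Neu5Gc", "Hex", "HexNAc"]),
}

for branch, mz in DIAGNOSTIC_IONS.items():
    print(f"[{branch} + H]+  m/z {mz:.2f}")
```

Neu5Ac and Neu5Gc differ by one oxygen atom (15.995 u), so their branch ions are separated by about 16 m/z units and are readily distinguished even at ion-trap resolution.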

Heterogeneous termini of α-galactosylated and sialylated glycans.
In addition to singly Neu5Gc-branched sialoglycans, the biantennary sialoglycans bearing heterogeneous termini of Neu5Ac, Neu5Gc and α-Gal at the non-reducing ends were also remarkably detected (Fig. 5). The EIC of the sialoglycan at m/z 1120.906 shows three well-separated peaks at 42.21 min, 44.33 min and 45.33 min (Fig. 5a), in which subtle differences of the glycan structures were identified by MS/MS (Fig. 5b-d). Pairs of the diagnostic ions containing terminal α-Gal and sialic acids were observed as the most common feature of the branch-specific glycan isomers. Similarly, the information obtained from these diagnostic ions and the high abundance complementary fragments can be used for the structural interpretation of branching linkage isomers 32 (Fig. 5b), respectively, we assigned the glycan isomer 9 at 42.21 min to contain a terminal Neu5Ac branch at the α3-antenna and α-Gal branch at the α6-antenna of G2F. The presence of the abundant Y 4α /[Y 4α -18] fragment pair, derived from the consequential losses of Neu5Ac-Gal-GlcNAc and water from the precursor ion, indicates the glycan structure is a β-anomer configuration. In the MS/MS spectrum of isomer 10 at 44.33 min (Fig. 5c) (Fig. 5d), indicating the heterogeneous termini of antenna fucose, α-Gal, Neu5Ac and Neu5Gc residues in the glycan(s). Parallel comparisons of MS/MS spectra among Fig. 5b-d suggest that the mixture of α-anomeric glycan isomer 9 and an antenna-fucosyl regioisomer of glycan 10 co-contribute to the fragmentation pattern of glycans at 45.33 min.
Structural analyses were continued on the glycan isomers consisting of bi-terminal Neu5Gc and α-Gal. PGC LC MS/MS analyses revealed three major peaks in the EIC of the ion at m/z 1128.9045 (Fig. 5e) (Fig. 5f,g), indicating the β-and α-anomers of glycan isomer 11 that comprise a Neu5Gc branch at the 3-antenna and the α-Gal branch at the 6-antenna. On the contrary, the fragmentation of glycan at 50.52 min yielded slightly lower  (Fig. 5h), suggesting an opposite location of the α-Gal and Neu5Gc-substitutions of isomer 12. Glycopeptide sequencing of mAbs showed that these glycans containing di-termini of Neu5Ac, Neu5Gc, α-Gal and antenna fucose are localized in both Fc and Fab regions of the murine cell-derived mAbs (Table S4).

Conclusions
In summary, through the comprehensive characterization of a wide variety of glycan isomers via parallel measurements of glycopeptides and released underivatized glycans using RP and PGC LC MS/MS, we demonstrated the suitability of a highly sensitive glycan sequencing method for rapid differentiation of low abundance non-human glycan isomers from human-compatible glycan counterparts. These methods are capable of accurately determining the site-specific location and structures of mAb glycan isomers, while the highly sensitive sequencing method of glycans is readily available to achieve the throughput, accuracy and reproducibility required for analyzing large glycomic datasets. A quantitative comparison of structurally consistent glycan components is feasible to evaluate the distribution of non-human glycan isomers by using the relative peak areas of EICs.
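A relative quantification based on EIC peak areas reduces to normalizing each area by the summed area of the glycans being compared. The sketch below shows that normalization with hypothetical glycan labels and peak areas. It assumes comparable ionization efficiency across the compared glycans, which is a simplification and not a claim made in this study.

```python
def relative_abundance(peak_areas):
    """Return each EIC peak area as a percentage of the summed area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("summed peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated EIC peak areas (arbitrary units).
areas = {
    "G0F": 600_000_000,
    "G1F": 300_000_000,
    "G2F": 90_000_000,
    "G2F + alpha-Gal": 10_000_000,
}

percentages = relative_abundance(areas)
for glycan, percent in percentages.items():
    print(f"{glycan}: {percent:.1f}%")
```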
In this work, we have also identified sulfation of α-Gal-containing glycans, confirmed the structural consistency of α-galactosylated and Neu5Gc-sialylated biantennary glycans, and revealed a wide distribution of homogeneous and heterogeneous non-human epitopes containing mono- and di-termini of Neu5Gc, α-Gal and possible antenna α2/3/4-fucose in mAbs produced in murine myeloma cells. The structural and functional characterization of glycan attributes is particularly important for regulating the quality of mAb biosimilars. Our findings increase understanding of the structural diversity and the cellular distributions of non-human glycans of recombinant mAbs. The results provide glycan targets and candidate structures for high-throughput analysis of non-human glycomics, and thus could assist in the quality control of future biotherapeutics production.
Glycoproteomic analyses of mAb glycopeptides using RP C18 column. 100 µg of mAb were reduced with 50 mM DTT at 60 °C and alkylated with 100 mM iodoacetamide for 1 h, followed by dialysis against 10 mM NH 4 HCO 3 and drying in a refrigerated CentriVap concentrator (Labconco). The proteins were digested at 37 °C overnight using trypsin (Promega) (enzyme-to-protein, w/w, 1/100) in 25 mM NH 4 HCO 3 . Resulting peptides were dried, dissolved in 0.2% FA and analyzed by RP LC MS/MS on an Orbitrap Fusion Lumos coupled with an Easy-nLC 1200 system (Thermo Fisher Scientific) 41 .
Glycomic analyses of released mAb glycans using PGC LC MS/MS. Alkylated proteins were loaded into tube gels followed by in-gel digestion using 1 µL of PNGase F in 25 mM NH 4 HCO 3 at 37 °C overnight. The released glycans were extracted using ACN/water (v/v, 1:1) and ACN. A portion of glycans was further cleaved with α1-2,3 mannosidase or α1-3,6 galactosidase in 50 mM sodium acetate containing 5 mM calcium chloride (pH 5.5) at 37 °C overnight. The products were purified using an in-house made PGC microcolumn.
The dried glycans were dissolved in 0.5% FA, and analyzed by nanospray PGC LC MS/MS on an Orbitrap Fusion mass spectrometer coupled with a nanoAcquity ultra performance LC or Acquity UPLC M-class system (Waters) 32 . The glycans were trapped onto a PGC column (150 µm × 2 cm) and separated with an analytical PGC column (100 µm × 20 cm) at 600 nL min −1 using a 60-min gradient of 10-25% of ACN in 0.1% FA. Orbitrap MS spectra were acquired at a resolution of 120,000, followed by ion-trap fragmentation of multiply charged ions using CID at a normalized collision energy of 25%.

Construction of a mAb N-glycan library for glycopeptide identification. The in-house mAb N-glycan library was constructed based on the common structures of N-glycans derived from the glycan biosynthesis pathway (Fig. S22). Biantennary and triantennary N-glycans were selected from the literature and LC MS/MS analyses of mAb drugs. Bisected N-acetylglucosamine (GlcNAc) glycans were incorporated for mAbs derived from human cells, although non-human mammalian cells are known to lack the gene encoding GNT-III for production of GlcNAc-bisecting N-glycans. Possible modifications were considered including phosphorylation of high-mannose N-glycans, sulfation of complex N-glycans and O-acetylation of sialic acids 32 . Protein sequences were collected from the Drugbank (https://www.drugbank.ca). Tryptic glycopeptides were identified by database search using Byonic software (Protein Metrics Inc.) and verified by manual inspection 41 . MS/MS sequencing by diagnostic ions and complementary fragments was also utilized to elucidate unknown glycopeptides for the structural identification and in-depth characterization of isobaric glycopeptides separated by RP LC MS/MS.
Elucidation of structural glycan isomers. Isomeric glycans were assigned according to the knowledge of the glycan biosynthetic pathway, diagnostic ions, accurate masses and MS/MS fragment patterns of glycans 32,55 . Glycan structures were further validated by LC elution order of glycan isomers and exoglycosidase sequencing. To simplify data interpretation, the glycan structures in Figs. 4 and 5 were numbered in order of appearance, and those numbers do not correspond to the labels of glycans in the library (Fig. S22).
Briefly, egg yolk isolated from 20 eggs was washed three times with 400 mL of diethyl ether followed by vigorous stirring and filtration. Sialylglycopeptide (SGP) was extracted from the resulting solid using 400 mL of 40% acetone twice with stirring overnight. The supernatant was filtered, washed again with 200 mL of 40% acetone, evaporated and concentrated to afford an off-white solid. The solid was resuspended in water, passed through a Buchner funnel containing 15 g of activated carbon, and washed with 2.5% acetonitrile. SGP was eluted with 25% acetonitrile containing 0.1% TFA. The eluted sialoglycans were subsequently lyophilized. Glycans were released from SGP using PNGase F and the resulting sialoglycans were desialylated using α2-3,6,8 neuraminidase. 20 mg of SGP solid was dissolved in 1.8 mL of water, followed by addition of 0.2 mL of 50 mM sodium acetate (NaOAc) containing 5 mM CaCl 2 (pH 5.5). PNGase F (5 µL) and neuraminidase (2 µL) were then added and the mixture was incubated overnight at 37 °C. The reaction progress was monitored by high pH anion-exchange chromatography-pulsed amperometric detection (HPAEC-PAD, Dionex IC-3000) using a Dionex CarboPac PA200 IC column (3 mm × 250 mm, 5.5 μm particle size) and a Gold Standard PAD waveform with an AgCl electrode. An LC gradient of 20%/0%/80% over 10 min to 60%/20%/20% of MP-A/MP-B/MP-C (MP, mobile phase; MP-A: 200 mM NaOH; MP-B: 150 mM NaOAc in 200 mM NaOH; MP-C: H 2 O) was used to separate glycans. G1(3) and G2 glycans eluted at 9.5 and 9.9 min, respectively. The glycans were further purified using HPLC equipped with a HyperCarb PGC column (Thermo Fisher, 10 mm × 150 mm, 5 µm particle size) and an LC gradient of 6.0% to 16 [...] while the G2 anomers eluted at 30.3 min and 35.9 min. Glycan compositions purified from each peak were confirmed by HPAEC-PAD analysis with commercially available G1 and G2 standards, and identified by LC MS/MS.
To prepare G1(6) glycan, the purified G2 glycans (1.1 mg) were degalactosylated by dissolving in 200 µL of water, 30 µL of 0.5 M KCl and 30 µL of 50 mM NaOAc (pH 6.0). 30 µL of LacZ β-galactosidase was then added, and the solution was incubated at 30 °C. The reaction was completed in 42 h, as analyzed by HPAEC-PAD, resulting in 25% of G0 (completely degalactosylated) and 61% of a mixture of G1(6) and G1(3). These products were then purified by HPLC using a HyperCarb PGC column (Thermo Fisher, 4.6 mm × 150 mm, 5 µm particle size) and an LC gradient of 4.0% to 13.7% MP A: MP B (MP A: 95% acetonitrile in 49% water containing 0.1% TFA; MP B: 0.1% TFA) at 2 mL min −1 over 41 min. The β- and α-anomers of G0 glycans eluted at 26.3 min and 33.0 min, respectively, while G1(6) anomers eluted at 30.0 min and 36.3 min and G1(3) anomers eluted at 31.0 and 37.5 min. Glycans were monitored at 214 nm using a UV detector, collected using a fraction collector and lyophilized. Purified glycans were confirmed by HPAEC-PAD analysis with commercially available G0 and G1 standards, and subsequent measurements by LC MS/MS.

Data availability
The datasets used and/or analyzed during the current study are available from the corresponding authors on reasonable request.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.