More than half of all human proteins are glycosylated. This covalent attachment of carbohydrates can alter protein activity. However, existing methods for studying the glycoproteome typically require laborious sample processing and proprietary software. They also often require enzymatic removal of the glycan, which makes it difficult to identify attachment sites.

Josef Penninger and colleagues at the Institute of Molecular Biotechnology of the Austrian Academy of Sciences developed a method to identify intact glycopeptides in proteomic data. Focusing on the intact fragments allows both the structure of the glycan and the attachment site in the associated protein to be identified.

In their approach, glycopeptides from cell lysates are enriched by hydrophilic interaction chromatography and profiled by nano-liquid chromatography electrospray-ionization tandem mass spectrometry. The team applied an algorithm that converts signals from multiply charged fragment ions to their singly charged equivalents. This shifted the fragment ions of intact glycopeptides to higher mass ranges, whereas the peptide backbone fragment ions remained in the lower mass range. A database search algorithm then allowed the researchers to identify intact glycopeptides from the processed spectra.
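The charge-conversion step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each peak's charge state has already been assigned (for example, from isotope spacing) and that all ions are protonated, and the function names `to_singly_charged` and `deconvolute` are hypothetical.

```python
# Illustrative sketch only; not the published algorithm.
# Assumption: every peak already carries an assigned charge state,
# and all charges come from added protons.

PROTON_MASS = 1.007276466  # mass of a proton in daltons


def to_singly_charged(mz: float, charge: int) -> float:
    """Return the m/z the same ion would show if it carried a single proton."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Remove all charge-carrying protons to get the neutral mass ...
    neutral_mass = (mz - PROTON_MASS) * charge
    # ... then add one proton back to get the 1+ ion.
    return neutral_mass + PROTON_MASS


def deconvolute(peaks):
    """Map (m/z, charge, intensity) peaks to sorted (1+ m/z, intensity) pairs."""
    return sorted(
        (to_singly_charged(mz, charge), intensity)
        for mz, charge, intensity in peaks
    )


# Hypothetical peaks: a doubly charged fragment moves to a much higher
# m/z after conversion, while a singly charged fragment stays in place.
example = deconvolute([(1000.5, 2, 50.0), (500.0, 1, 80.0)])
```

In this toy input, the doubly charged peak at m/z 1000.5 lands near m/z 2000 after conversion, which is the kind of separation between large, glycan-bearing fragments and smaller backbone fragments that the paragraph describes.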

Using this technique, the authors characterized the glycoproteome of mouse and human embryonic stem cells. In mouse, they found 3,380 glycopeptides mapping to about 500 different glycoproteins, which almost doubled the number of confirmed mouse N-glycoproteins. In the human samples, they found about 1,100 glycopeptides mapping to 576 proteins.

Penninger and his team identified conserved and species-specific glycoproteins, and they discovered novel glycosylation of stemness factors. They also found several nuclear envelope and cytosolic proteins to be N-glycosylated and discovered multiple new nuclear O-glycoproteins. The type of glycan linkage differed between species: galactose and N-acetylhexosamine were the predominant terminal sugars on complex N-glycans in mouse, whereas human glycans tended to be sialylated.

The authors have made their collection of software tools for automating glycopeptide identification available in a package called SugarQb, and they anticipate that it will allow rapid characterization of the glycoproteome from diverse species.