Shotgun proteomics uses liquid chromatography–tandem mass spectrometry to identify proteins in complex biological samples. We describe an algorithm, called Percolator, for improving the rate of confident peptide identifications from a collection of tandem mass spectra. Percolator uses semi-supervised machine learning to discriminate between correct and decoy spectrum identifications, correctly assigning peptides to 17% more spectra from a tryptic Saccharomyces cerevisiae dataset, and up to 77% more spectra from non-tryptic digests, relative to a fully supervised approach.
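The abstract says only that Percolator learns semi-supervised to separate correct from decoy identifications. It gives no procedure. What follows is a minimal sketch of one way such a loop could look, not the authors' implementation. Every name and parameter here is hypothetical: the functions `q_values`, `train_logistic` and `rescore`, the 0.01 threshold, and the iteration count. A plain logistic regression trained by stochastic gradient descent stands in for whatever classifier the authors use.

```python
import math


def q_values(scores, is_decoy):
    """Estimate a q-value for each match by counting decoys above its score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fdrs, targets, decoys = [], 0, 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # Decoys seen so far approximate the incorrect targets seen so far.
        fdrs.append(decoys / max(targets, 1))
    q, running_min = [0.0] * len(scores), float("inf")
    for rank in range(len(order) - 1, -1, -1):
        # A q-value is the smallest FDR at which the match is still accepted.
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_logistic(X, y, epochs=200, lr=0.1):
    """Stand-in linear classifier (assumption: not the authors' learner)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            z = max(-30.0, min(30.0, dot(w, x) + b))  # clamp to avoid overflow
            grad = 1.0 / (1.0 + math.exp(-z)) - label
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def rescore(features, is_decoy, initial_feature=0, threshold=0.01, iterations=3):
    """Iteratively relabel confident targets as positives and retrain."""
    scores = [f[initial_feature] for f in features]
    for _ in range(iterations):
        q = q_values(scores, is_decoy)
        # Positives: target matches that pass the threshold under current scores.
        pos = [f for f, d, qv in zip(features, is_decoy, q) if not d and qv <= threshold]
        # Negatives: decoy matches, which are incorrect by construction.
        neg = [f for f, d in zip(features, is_decoy) if d]
        if not pos or not neg:
            break
        w, b = train_logistic(pos + neg, [1] * len(pos) + [0] * len(neg))
        scores = [dot(w, f) + b for f in features]
    return scores
```

Calling `rescore(features, is_decoy)` with one feature vector per spectrum match returns a new score per match. Passing those scores back through `q_values` shows how many target matches pass a chosen threshold after rescoring.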
This work was funded by US National Institutes of Health grants P41 RR011823 and R01 EB007057.
Käll, L., Canterbury, J., Weston, J. et al. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods 4, 923–925 (2007). https://doi.org/10.1038/nmeth1113