Letter abstract

Nature Biotechnology 24, 1285 - 1292 (2006)
Published online: 10 September 2006 | doi:10.1038/nbt1240

A probability-based approach for high-throughput protein phosphorylation analysis and site localization

Sean A Beausoleil1, Judit Villén1, Scott A Gerber2, John Rush3 & Steven P Gygi1


Data analysis and interpretation remain major logistical challenges when attempting to identify large numbers of protein phosphorylation sites by nanoscale reverse-phase liquid chromatography/tandem mass spectrometry (LC-MS/MS) (Supplementary Figure 1 online). In this report we address challenges that are often only addressable by laborious manual validation, including data set error, data set sensitivity and phosphorylation site localization. We provide a large-scale phosphorylation data set with a measured error rate as determined by the target-decoy approach, we demonstrate an approach to maximize data set sensitivity by efficiently distracting incorrect peptide spectral matches (PSMs), and we present a probability-based score, the Ascore, that measures the probability of correct phosphorylation site localization based on the presence and intensity of site-determining ions in MS/MS spectra. We applied our methods in a fully automated fashion to nocodazole-arrested HeLa cell lysate where we identified 1,761 nonredundant phosphorylation sites from 491 proteins with a peptide false-positive rate of 1.3%.
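The two quantitative ideas in the abstract, a measured error rate from a target-decoy search and a probability-based localization score, can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption for illustration: the function names and parameters are hypothetical, the error-rate estimate uses the common concatenated-database convention (each decoy hit implies one hidden incorrect target hit), and the localization score is a generic binomial-tail score expressed as -10·log10(P), not necessarily the Ascore as the authors define it.

```python
import math


def target_decoy_error_rate(n_target: int, n_decoy: int) -> float:
    """Estimate the fraction of incorrect PSMs in a concatenated target-decoy search.

    Assumption: an incorrect match is equally likely to land on a target or a
    decoy sequence, so the number of incorrect matches is about 2 * n_decoy.
    """
    total = n_target + n_decoy
    if total == 0:
        return 0.0
    return 2.0 * n_decoy / total


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


def localization_score(n_site_ions: int, n_matched: int, p_random: float) -> float:
    """Illustrative probability-based score for one candidate site.

    n_site_ions: number of site-determining ions predicted for the candidate
    n_matched:   how many of those ions were observed in the spectrum
    p_random:    assumed chance that a single ion matches at random (0 < p < 1)

    A smaller chance probability gives a larger score.
    """
    tail = binomial_tail(n_site_ions, n_matched, p_random)
    return -10.0 * math.log10(tail)
```

For example, under these assumptions `target_decoy_error_rate(990, 10)` estimates that about 2% of accepted matches are incorrect, and a candidate site with more matched site-determining ions receives a higher `localization_score` than one with fewer.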

  1. Department of Cell Biology, Harvard Medical School, 240 Longwood Ave., Boston, Massachusetts 02115, USA.
  2. Department of Genetics and Norris Cotton Cancer Center, Lebanon, New Hampshire 03755, USA.
  3. Cell Signaling Technology, Inc., Beverly, Massachusetts 01915, USA.

Correspondence to: Steven P Gygi1 e-mail:

