  • Brief Communication

Thesaurus: quantifying phosphopeptide positional isomers

Abstract

Proteins can be phosphorylated at neighboring sites resulting in different functional states, and studying the regulation of these sites has been challenging. Here we present Thesaurus, a search engine that detects and quantifies phosphopeptide positional isomers from parallel reaction monitoring and data-independent acquisition mass spectrometry experiments. We apply Thesaurus to analyze phosphorylation events in the PI3K/AKT signaling pathway and show neighboring sites with distinct regulation.

Fig. 1: Phosphopeptide detection with Thesaurus.
Fig. 2: Detection and quantification of IRS1 phosphorylation.

Data availability

All mass spectrometry files (Supplementary Dataset 3) presented here have been deposited to the Chorus Project (https://chorusproject.org/) with the project identifier 1374 and at the MassIVE proteomics repository (https://massive.ucsd.edu/) with project identifier MSV000082956. The high-resolution human phosphopeptide spectrum library used in this study is available through Phosphopedia (https://phosphopedia.gs.washington.edu/PhosphoproteomicsAssay/sections/elib.xhtml).

References

  1. Boersema, P. J., Foong, L. Y. & Ding, V. M. et al. Mol. Cell Proteom. 9, 84–99 (2010).
  2. Villén, J., Beausoleil, S. A., Gerber, S. A. & Gygi, S. P. Proc. Natl Acad. Sci. USA 104, 1488–1493 (2007).
  3. Schweiger, R. & Linial, M. Biol. Direct 5, 6 (2010).
  4. Huang, C. Y. & Ferrell, J. E. Proc. Natl Acad. Sci. USA 93, 10078–10083 (1996).
  5. Nash, P., Tang, X. & Orlicky, S. et al. Nature 414, 514–521 (2001).
  6. Chiu, J. C., Ko, H. W. & Edery, I. Cell 145, 357–370 (2011).
  7. Liu, Y. F., Herschkovitz, A. & Boura-Halfon, S. et al. Mol. Cell Biol. 24, 9668–9681 (2004).
  8. Beausoleil, S. A., Villén, J., Gerber, S. A., Rush, J. & Gygi, S. P. Nat. Biotechnol. 24, 1285–1292 (2006).
  9. Peterson, A. C., Russell, J. D., Bailey, D. J., Westphall, M. S. & Coon, J. J. Mol. Cell Proteom. 11, 1475–1488 (2012).
  10. Venable, J. D., Dong, M. Q., Wohlschlegel, J., Dillin, A. & Yates, J. R. Nat. Methods 1, 39–45 (2004).
  11. Rosenberger, G., Liu, Y. & Röst, H. L. et al. Nat. Biotechnol. 35, 781–788 (2017).
  12. Röst, H. L., Rosenberger, G. & Navarro, P. et al. Nat. Biotechnol. 32, 219–223 (2014).
  13. Meyer, J. G., Mukkamalla, S., Steen, H., Nesvizhskii, A. I., Gibson, B. W. & Schilling, B. Nat. Methods 14, 646–647 (2017).
  14. Tsou, C. C., Avtonomov, D. & Larsen, B. et al. Nat. Methods 12, 258–264 (2015).
  15. Peckner, R., Myers, S. A. & Jacome, A. S. V. et al. Nat. Methods 15, 371 (2018).
  16. Searle, B. C., Pino, L. K. & Egertson, J. D. et al. Nat. Commun. 9, 5128 (2018).
  17. Lawrence, R. T., Searle, B. C., Llovet, A. & Villén, J. Nat. Methods 13, 431–434 (2016).
  18. Yi, Z., Luo, M., Carroll, C. A., Weintraub, S. T. & Mandarino, L. J. Anal. Chem. 77, 5693–5699 (2005).
  19. Luo, M., Langlais, P. & Yi, Z. et al. Endocrinology 148, 4895–4905 (2007).
  20. Frewen, B. E., Merrihew, G. E., Wu, C. C., Noble, W. S. & MacCoss, M. J. Anal. Chem. 78, 5678–5684 (2006).
  21. MacLean, B., Tomazela, D. M. & Shulman, N. et al. Bioinformatics 26, 966–968 (2010).
  22. Olsen, J. V., Blagoev, B. & Gnad, F. et al. Cell 127, 635–648 (2006).
  23. Savitski, M. M., Lemeer, S. & Boesche, M. et al. Mol. Cell Proteom. 10, M110.003830 (2011).
  24. Taus, T., Köcher, T. & Pichler, P. et al. J. Proteome Res. 10, 5354–5362 (2011).
  25. Fermin, D., Walmsley, S. J., Gingras, A. C., Choi, H. & Nesvizhskii, A. I. Mol. Cell Proteom. 12, 3409–3419 (2013).
  26. Ma, C. W. & Lam, H. J. Proteome Res. 13, 2262–2271 (2014).

Acknowledgements

We would like to thank members of the Villén and MacCoss laboratories for useful discussions. We would also like to thank G. Rosenberger, H. Röst, J. Meyer and B. Schilling for insightful conversations about detecting positional isomers. B.C.S. was supported by grant no. F31 GM119273. R.T.L. was supported by a Samuel and Althea Stroum Endowed Graduate Fellowship. This work is supported by NIH grants P41 GM103533, R21 CA192983 and U54 HG008097 to M.J.M. and R35 GM119536, R01 AG056359 to J.V., as well as a research grant from the W.M. Keck Foundation to J.V.

Author information

Contributions

R.T.L. and J.V. conceived the study. B.C.S., R.T.L., and J.V. designed the experiments. B.C.S. and R.T.L. performed the experiments. B.C.S. designed and wrote the software, and analyzed the data. R.T.L. generated the spectrum library. M.J.M. and J.V. supervised the work. B.C.S., R.T.L., M.J.M. and J.V. wrote the paper.

Corresponding author

Correspondence to Judit Villén.

Ethics declarations

Competing interests

The MacCoss Laboratory at the University of Washington has a sponsored research agreement with Thermo Fisher Scientific, the manufacturer of the instrumentation used in this research. Additionally, M.J.M. is a paid consultant for Thermo Fisher Scientific and B.C.S. is a shareholder and paid consultant of Proteome Software.

Additional information

Peer review information: Allison Doerr was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Thesaurus algorithmic workflow to search for positional isomers from DIA data.

For each phosphopeptide in a spectrum library (for example MQSLSLNK from PDAP1), Thesaurus (a) determines the combinatorial list of potential positional isomers and (b) generates synthetic library spectra for positional isomers that are missing from the library. (c) Then for each isomer, Thesaurus calculates primary scores at each retention time point using all fragment ions in the library spectrum. (d) Starting with the highest scoring isomer/retention time pair, (e) Thesaurus calculates a pairwise p-value for detecting that isomer at that retention time versus every other potential isomer using only site-specific fragment ions. The localization p-value is the least significant of those pairwise comparisons. (f) Afterwards, Thesaurus checks the phosphopeptide detection to make sure a sufficient number of fragment ions follow the peak shape described by the site-specific ions. (g) Isomers are reserved for FDR analysis if they pass user-specified localization p-value and IonCount score thresholds. However, if no isomer passes these thresholds for a given phosphopeptide, the isomer with the lowest p-value is reserved. Thesaurus iterates to the next highest scoring isomer/retention time pair, and repeats steps e-g until all the potential positional isomers are considered. After every phosphopeptide is considered, the detected positional isomers are processed with Percolator, and localization p-values for passing isomers are also independently Benjamini-Hochberg FDR corrected.
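Steps (a) and (e) of this workflow can be illustrated with a minimal Python sketch. It is not the Thesaurus implementation: the function names `positional_isomers` and `localization_p_value` are hypothetical, the `pS`/`pT`/`pY` notation simply follows the peptide names used in these captions, and the pairwise p-values are assumed to have been computed elsewhere from site-specific fragment ions.

```python
from itertools import combinations

# Residues that can carry a phosphate group (assumption: S, T, and Y only).
PHOSPHO_RESIDUES = set("STY")


def positional_isomers(sequence: str, n_phospho: int = 1) -> list[str]:
    """Enumerate every placement of n_phospho phosphates on S/T/Y residues.

    Phosphorylated residues are written with a leading 'p', e.g. 'MQpSLSLNK'.
    """
    sites = [i for i, aa in enumerate(sequence) if aa in PHOSPHO_RESIDUES]
    isomers = []
    for chosen in combinations(sites, n_phospho):
        chosen_set = set(chosen)
        isomers.append(
            "".join(
                ("p" + aa) if i in chosen_set else aa
                for i, aa in enumerate(sequence)
            )
        )
    return isomers


def localization_p_value(pairwise_p_values: list[float]) -> float:
    """Return the least significant (largest) of the pairwise p-values."""
    if not pairwise_p_values:
        raise ValueError("at least one pairwise comparison is required")
    return max(pairwise_p_values)


if __name__ == "__main__":
    # The example peptide from the caption has two serines, so two
    # singly phosphorylated isomers are possible.
    print(positional_isomers("MQSLSLNK"))
    # Hypothetical pairwise p-values against three competing isomers.
    print(localization_p_value([1e-5, 3e-3, 4e-4]))
```

For MQSLSLNK the sketch yields the two singly phosphorylated isomers MQpSLSLNK and MQSLpSLNK. Taking the maximum of the pairwise p-values is the conservative choice described in the caption: an isomer is only as well localized as its weakest comparison.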

Supplementary Figure 2 Interference in DIA experiments is increased relative to DDA.

(a) The frequency in the DDA library (N=1) that fragment ions are present (+/- 10 ppm) in each MS/MS with precursors of m/z from 590.5183 to 600.5229. (b) The frequency that fragment ions are present (+/- 10 ppm) in each MS/MS scan generated for the 590.5183 to 600.5229 m/z isolation window in DIA replicates (N=4). (c) The relative ion frequency for the most common ion at each 1 m/z bin in both DIA and DDA. While many fragment ions found at a frequency of 1/100 or less are relatively consistent between the two data sets, many fragment ions show dramatically higher frequency in DIA spectra than in their DDA counterparts. (d) Changes in the frequency distribution across precursor isolation windows and at different fragment mass tolerances. The frequency of higher mass fragment ions increases with higher mass precursor windows and higher fragment mass tolerances (N=4).
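The "+/- 10 ppm" matching and the per-ion frequency described in this caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names `within_ppm` and `ion_frequency` are hypothetical, each spectrum is represented simply as a list of peak m/z values, and the tolerance is taken relative to the theoretical m/z.

```python
def within_ppm(observed_mz: float, theoretical_mz: float,
               tolerance_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies within tolerance_ppm of the theoretical m/z."""
    allowed_error = theoretical_mz * tolerance_ppm * 1e-6
    return abs(observed_mz - theoretical_mz) <= allowed_error


def ion_frequency(spectra: list[list[float]], target_mz: float,
                  tolerance_ppm: float = 10.0) -> float:
    """Fraction of spectra with at least one peak matching target_mz."""
    if not spectra:
        return 0.0
    hits = sum(
        any(within_ppm(mz, target_mz, tolerance_ppm) for mz in peaks)
        for peaks in spectra
    )
    return hits / len(spectra)


if __name__ == "__main__":
    # Three toy spectra; at m/z 500 a 10 ppm window is +/- 0.005 m/z.
    toy_spectra = [[500.004, 600.0], [500.02], [499.997]]
    print(ion_frequency(toy_spectra, 500.0))
```

Because a ppm tolerance scales with m/z, the absolute window widens for heavier fragments, which is one reason the frequency of matches can rise with fragment mass and with looser tolerances.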

Supplementary Figure 3 The number of phosphopeptides detected from the SWATH-MS DIA synthetic phosphopeptide reference set.

We performed analyses of this data set (N=1) using Thesaurus, OpenSwath/IPF (Inference of PeptidoForms), and two methods based on DIA-Umpire: PIQED (using PTMProphet), and Ascore. Results were filtered to an estimated 5% localization FDR in each algorithm. Library matches were marked as correct if they corresponded to expected synthetic positional isomers (579 of 1262 total). Thesaurus is able to detect an additional 11 positional rearrangements (see Supplementary Fig. 4 for more details). To ensure a fair comparison, the detection of these manually curated isomers does not count as either a correct or an incorrect match. Manual curation of positional rearrangements required that the rearranged positional isomer was (a) detected at the same retention time as the expected isomer, and (b) contained at least two site-specific fragment ions that fit the shape of the expected isomer without any interference. Comet is able to make detections to peptides not present in the library. Again, to ensure a fair comparison, these detections also do not count as either correct or incorrect.

Supplementary Figure 4 Phosphopeptide gas-phase rearrangement of the peptide GIRPpSPLENSHR.

Thesaurus was able to detect 11 products of a gas-phase phosphate rearrangement from the SWATH-MS DIA synthetic phosphopeptide reference set (N=1). These products were detected at the same retention time as the expected isomer and generated at least two site-specific fragment ions. (a) Phosphopeptide GIRPSPLENSHR sequence and theoretical fragment ions, with 10 site-localizing ions: b5 to b9 and y4 to y8. (b) All of the localizing ions are observed in the Rosenberger et al. SWATH-MS DIA synthetic phosphopeptide reference set for the isomer GIRPpSPLENSHR. (c) Four site-localizing ions for the alternate isomer, GIRPSPLENpSHR, are also observable at the same elution time. Unlike localization algorithms that compete positional isomers against each other, Thesaurus is able to assign p-values to each variant independently. Thesaurus then flags co-eluting isomers and provides capabilities to both quantify the extent of gas-phase phosphate rearrangements and distinguish rearrangements from truly co-eluting phosphopeptides using quantitative fragment ion ratios. In addition to GIRPpSPLENSHR, 10 other peptides were flagged as producing positional isomers.

Supplementary Figure 5 The number of confidently observed positional isomers for singly phosphorylated phosphopeptides.

Previously we reported a phosphopeptide database based on over one thousand DDA experiments from four labs. In this work we created a spectrum library with a subset of this database, selecting from runs acquired in house and on a Q-Exactive mass spectrometer. This spectrum library contains 82,029 distinct positional isomers detected at a 1% FDR level (N=1). For this figure we required that each peptide was observed at least 50 times in this library to avoid undersampling issues, and each isomer was site localized to Ascore>13 (p-value<0.05).
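The correspondence between Ascore>13 and p-value<0.05 quoted above can be checked with a short sketch. It assumes the conventional scaling Ascore = -10·log10(p); the function names are hypothetical and are not part of any Ascore software.

```python
import math


def ascore_to_p_value(ascore: float) -> float:
    """Convert an Ascore to a p-value, assuming Ascore = -10 * log10(p)."""
    return 10 ** (-ascore / 10)


def p_value_to_ascore(p_value: float) -> float:
    """Convert a p-value to an Ascore under the same assumed scaling."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return -10 * math.log10(p_value)


if __name__ == "__main__":
    # A p-value of 0.05 maps to an Ascore of roughly 13.
    print(round(p_value_to_ascore(0.05), 2))
    # The Ascores quoted for AITGASLADIMAK in Supplementary Fig. 7.
    for score in (46.2, 30.7):
        print(score, ascore_to_p_value(score))
```

Under this assumption the threshold of 13 is a rounded value, so Ascore>13 corresponds to a p-value just under approximately 0.05.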

Supplementary Figure 6 The number of phosphopeptides detected by Thesaurus as positional isomers.

(a) The average (bars) and number (circles) of unique phosphopeptide sequences and fully localized isomers (p-value<0.01) using Thesaurus across four HeLa technical DIA replicates. (b) The average percent (bars) and individual percent (circles) of peptides that contained multiple serines, threonines, or tyrosines that could be fully localized to one or multiple isomers (N=4). (c) The number of localized phosphopeptides with retention time differences between positional isomers indicated in the x-axis.

Supplementary Figure 7 DDA spectra showing different localizations of the early and late eluting singly phosphorylated AITGASLADIMAK peptide from RPL24.

Ion detections from (a) late eluting peptide assigned to AIpTGASLADIMAK (Ascore=46.2) and (b) early eluting peptide assigned to AITGApSLADIMAK (Ascore=30.7) are contrasted with the incorrect localization for (c) late eluting as AITGApSLADIMAK and (d) early eluting as AIpTGASLADIMAK (N=1).

Supplementary Figure 8 Significantly changing MCF-7 phosphopeptides after insulin and IGF-1 stimulation.

(a) Heat map of 2273 significantly changing localized phosphopeptides. Changes were identified with one-way ANOVAs from all six conditions and six cell culture replicates, filtered to a Benjamini-Hochberg corrected FDR<0.05. Red indicates >=2-fold up while blue indicates <=2-fold down. These peptides were K-means clustered into five groups. (b) Integrated intensities for FOXO3A pS253, PRAS40 pT246, AS160 S588, and TSC2 S939. These sites all follow the canonical AKT RXRXX[pS/pT] motif and are known to be directly phosphorylated by AKT. Boxes indicate quartiles and medians, while whiskers indicate the estimated 5% and 95% ranges; N=6.
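The Benjamini-Hochberg correction used to filter these ANOVA results can be sketched in plain Python. This is the textbook step-up procedure, not the authors' analysis script, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical ANOVA p-values
    corrected = benjamini_hochberg(raw)
    # Keep tests whose adjusted p-value falls below the FDR threshold.
    significant = [q < 0.05 for q in corrected]
    print(corrected, significant)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.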

Supplementary Figure 9 Scatterplot of retention time and quantitative differences between localized MCF-7 phosphopeptide isomers.

Quantitative differences between isomers were estimated as the Benjamini-Hochberg FDR corrected Thesaurus p-value that the ratios of the isomer pairs after stimulation were pairwise consistent using a two-tailed t-test across all 24 test/control comparisons for each peptide (4 comparisons: control/DMSO vs. IGF-1/DMSO, control/DMSO vs. insulin/DMSO, control/MK-2206 vs. IGF-1/MK-2206, and control/MK-2206 vs. insulin/MK-2206; N=6). As with HeLa, the majority of peptide isomers in MCF-7 elute within 60 seconds of each other. Most isomer pairs either do not significantly change after stimulus or do not differ from each other in relative expression profiles (white area). These sites may represent redundant function or background phosphorylation, or may even indicate gas-phase rearrangement if the retention time differences are small enough. However, several positional isomers significantly change in different directions (light blue area), and these suggest differential function.

Supplementary Figure 10 Dynamic response upon pathway stimulation or kinase inhibition can differ significantly between two positional isomers of the same peptide, indicating differential regulation.

Phosphorylation of eIF4G (a) S1185 by PKCα is thought to induce binding to MNK1, but the function of S1187 (b) is unknown. The functional relevance of both phosphosites on RBM34 (c and d) is unknown, but they respond to insulin/IGF-1 in opposite directions. In the Rho GTPase activating protein ARHGAP5 (e and f), phosphorylation of both S1195 and S1202 increases with insulin/IGF-1, but only S1195 changes as a result of AKT inhibition. (g) S16 phosphorylation of STMN1 responds equally to insulin and IGF-1, while (h) S25 responds much more significantly to IGF-1. Boxes indicate quartiles and medians, while whiskers indicate the estimated 5% and 95% ranges. FDR is calculated as the Benjamini-Hochberg corrected ANOVA test between six conditions with six replicates.

Supplementary Figure 11 Validation of MS and MS/MS-level quantitation of IRS1 positional isomer phosphopeptides with PRM.

(a) Example PRM fragment ion chromatograms of one replicate confirm the detection of three positional isomers of phosphorylated KGSGDYMPMSPK from IRS1 using +/- 0.7 m/z precursor isolation (as compared to +/- 10 m/z precursor isolation in the DIA experiments). (b) Phosphopeptide intensity ratios between insulin and control samples derived with targeted PRM (gold standard MS/MS quantification), and DIA with both MS and MS/MS-based quantification. DIA MS and MS/MS quantification for pS629 and pS636 are relatively accurate. While DIA-based quantitation for pY632 is compressed, MS/MS quantification is subject to less interference than MS (precursor) quantification and shows less ratio compression.

Supplementary Figure 12 Differential expression of singly and doubly phosphorylated KGSGDYMPMSPK.

Box plots and values indicating summed fragment ion intensities for (a) singly phosphorylated KGpSGDYMPMSPK, (b) singly phosphorylated KGSGDYMPMpSPK, and (c) doubly phosphorylated KGpSGDYMPMpSPK across six biological replicates after stimulation with insulin, IGF-1, or unstimulated (control); with and without MK-2206. Boxes indicate quartiles and medians, while whiskers indicate the estimated 5% and 95% ranges. Intensity values are block normalized within phosphopeptide positional isomers using a linear model in order to keep measurements on the same scale.

Supplementary Figure 13 Differential expression of co-eluting phosphopeptide positional isomers.

Several phosphopeptides are indistinguishable by retention time. However, Thesaurus can detect and independently quantify both positional isomers using site-specific ions. Here we show fragment ion chromatograms for positional isomers of MARK3 with phosphorylation on S469 (a) or S476 (b). The fragment ion chromatograms on the left are representative of the control group (N=6) and the right chromatograms are representative data from the IGF-1/MK-2206 treated cells (N=6). Solid lines are site-specific ions, and dashed lines are shared ions. On the right side we show the integrated intensities for those positional isomers in the different experimental conditions (N=6). Boxes indicate quartiles and medians, while whiskers indicate the estimated 5% and 95% ranges. Phosphorylation of S469 additively decreases upon insulin/IGF-1 and AKT inhibition (FDR=3.6e-7), while phosphorylation of S476 remains constant (FDR=0.54). (c) Log2 ratios of site-specific fragment ion intensities with at least 1e5 total integrated signal. We find that with the exception of one ion (y17+2H for S476), site-specific fragment ions visually cluster into two groups corresponding to S469 (red lines) and S476 (blue lines) phosphorylation quantitative levels.

Supplementary information

Supplementary Information

Supplementary Figs. 1–13 and Supplementary Note

Reporting Summary

Supplementary Dataset 1

Phosphopeptide identifications and site localization statistics using Thesaurus on the HeLa DIA data. This table summarizes the localization scores and statistics for Fig. 1 and Supplementary Fig. 6.

Supplementary Dataset 2

Phosphopeptide quantification results with DIA and Thesaurus for the insulin/IGF-1 experiment in MCF-7 cells. The first table summarizes the median values across replicates for significantly changing phosphopeptides (FDR < 0.05). These data were used to generate the heat map in Supplementary Fig. 8a. The second table contains the quantitative intensity reported for every replicate for the 7701 consistently detected phosphopeptides. These data were used to generate the box plots in Fig. 2 and Supplementary Figs. 8b, 10, 12, and 13.

Supplementary Dataset 3

Table with sample names, descriptions, and Chorus identifiers. This table summarizes the experimental condition, replicate number and data acquisition method for each raw mass spectrometry data file used in this work, linked to the Chorus Project identifiers.

Cite this article

Searle, B.C., Lawrence, R.T., MacCoss, M.J. et al. Thesaurus: quantifying phosphopeptide positional isomers. Nat Methods 16, 703–706 (2019). https://doi.org/10.1038/s41592-019-0498-4

  • DOI: https://doi.org/10.1038/s41592-019-0498-4
