  • Correspondence
Mass spectrometrists should search for all peptides, but assess only the ones they care about

Figure 1: Comparison of methods on the Plasmodium falciparum example (Plasmodium subset).

Acknowledgements

This research was supported by the Ghent University Multidisciplinary Research Partnership “Bioinformatics: from nucleotides to network,” VLAIO SBO grant “INSPECTOR” (120025) and the concerted Research Action BOF12/GOA/014, Ghent University.

Author information

Corresponding authors

Correspondence to Lennart Martens or Lieven Clement.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Comparison of methods on the Pyrococcus furiosus example.

Histograms of MS-GF+ scores (grey) with the estimated number of correct PSMs (#targets - #decoys, red) and incorrect PSMs (#decoys, blue) and the 1% FDR cutoff (dashed line) for the search-all-assess-all (all-all, panel a), the search-subset-assess-subset (sub-sub, panels b, d) and the search-all-assess-subset (all-sub, panel c) strategies. The GO term “ATP binding” was used to generate the subset of interest. In panels (a) and (c), the spectra are searched against the complete Pyrococcus database, and in panels (b) and (d) against a database containing only “ATP binding” proteins. Panel (a) shows scores for all PSMs; panels (b)–(d) show scores for the “ATP binding” subset only. The fraction of incorrect PSMs (π0, first mode in the target distribution) is lower in the complete Pyrococcus set (all-all, panel a) than in the “ATP binding” subset (all-sub, panel c), indicating that the all-all FDR is too liberal for the subset. The 1% FDR cutoff in the sub-sub strategy (panel b) shifts to higher values, which decreases the number of subset PSMs found compared with the all-all and all-sub strategies. Panel (d) also shows that sub-sub forces many spectra onto incorrect subset peptides (orange bars). Indeed, 6110 (8617 - 2507) spectra that match other Pyrococcus targets or decoys in the complete search switch to an “ATP binding” sequence in the subset search. Of the sub-sub PSMs above the 1% FDR cutoff, 1.4% have switched peptide sequences (panel d, orange) compared with the complete search (all-all and all-sub strategies). These switched PSMs have much lower scores than in the complete search and are questionable at best (black and orange boxplots below the histogram in panel d).
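
The target-decoy bookkeeping described in this caption can be illustrated with a minimal Python sketch. It assumes a concatenated target-decoy search in which higher scores are better and the FDR at a score cutoff is estimated as #decoys / #targets at or above that cutoff. The function names and the (score, is_decoy) input layout are hypothetical; they are not taken from the article or from MS-GF+.

```python
from typing import Iterable, Optional, Tuple

Psm = Tuple[float, bool]  # (score, is_decoy); hypothetical layout


def fdr_cutoff(psms: Iterable[Psm], alpha: float = 0.01) -> Optional[float]:
    """Lowest score cutoff whose estimated FDR (#decoys / #targets) is <= alpha.

    Returns None when no cutoff satisfies the requested level.
    """
    ordered = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for i, (score, is_decoy) in enumerate(ordered):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Tied scores share one cutoff: evaluate only after the last tie.
        if i + 1 < len(ordered) and ordered[i + 1][0] == score:
            continue
        if targets > 0 and decoys / targets <= alpha:
            cutoff = score
    return cutoff


def estimated_counts(psms: Iterable[Psm], cutoff: float) -> Tuple[int, int]:
    """(estimated correct, estimated incorrect) PSMs at or above the cutoff.

    Incorrect is estimated by #decoys, correct by #targets - #decoys.
    """
    kept = [p for p in psms if p[0] >= cutoff]
    decoys = sum(1 for _, is_decoy in kept if is_decoy)
    targets = len(kept) - decoys
    return targets - decoys, decoys
```

With these assumptions, the dashed line in each panel would correspond to `fdr_cutoff(psms, 0.01)` computed on the PSMs shown in that panel, and the red and blue curves to the two values returned by `estimated_counts`.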

Supplementary Figure 2 Comparison of methods on the Plasmodium falciparum example (human-subset).

Histograms of MS-GF+ scores (grey) with the estimated number of correct PSMs (#targets - #decoys, red) and incorrect PSMs (#decoys, blue) and the 1% FDR cutoff (dashed line) for the search-all-assess-all (all-all, panel a), the search-subset-assess-subset (sub-sub, panels b, d) and the search-all-assess-subset (all-sub, panel c) strategies. In panels (a) and (c), the spectra are searched against a human + Plasmodium database (complete search), and in panels (b) and (d) against a human-only database. Panel (a) shows PSM scores for both human and Plasmodium; panels (b)–(d) show scores for the human subset only. The fraction of incorrect PSMs (π0, first mode in the target distribution) is lower in the human + Plasmodium set (all-all, panel a) than in the human subset (all-sub, panel c), indicating that the all-all FDR is too liberal for the subset. The 1% FDR cutoff in the sub-sub strategy (panel b) shifts to higher values, which decreases the number of subset PSMs found compared with the all-all and all-sub strategies. The sub-sub strategy also forces many spectra onto incorrect subset peptides (huge first mode of the distribution). Indeed, 13553 (30286 - 16733) spectra that match Plasmodium targets or decoys in the human + Plasmodium search switch to a human sequence in the subset search. Of the sub-sub PSMs above the 1% FDR cutoff, 1.3% have switched peptide sequences (panel d, orange) compared with the complete search (all-all and all-sub strategies). These switched PSMs have lower scores than in the complete search and are questionable at best (black and orange boxplots below the histogram in panel d). Sub-sub puts an even higher burden on the target-decoy approach for the human subset than for the Plasmodium subset (Figure 1 in the main manuscript and Supplementary Figure 3), because the sample contains more high-quality Plasmodium spectra, which aggravates the problem of forced PSMs.
Note that the results for the human subset are included only to illustrate that the poor FDR control of all-all and sub-sub is not due to a specific choice of subset. Also note that we do not advocate applying all-sub to every possible subset and then combining the results.
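
The difference between the all-all and all-sub strategies compared in this caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the `PSM` record, its field names and the `in_subset` flag (assumed to be set for both subset targets and the decoys derived from them) are hypothetical, higher scores are assumed to be better, and tied scores are not treated specially.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PSM:
    spectrum_id: str
    score: float      # higher is assumed to be better
    is_decoy: bool
    in_subset: bool   # maps to a protein of interest, or to its decoy


def accept_at_fdr(psms: List[PSM], alpha: float) -> List[PSM]:
    """Target PSMs above the lowest cutoff with #decoys / #targets <= alpha."""
    ordered = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = keep = 0
    for i, p in enumerate(ordered, start=1):
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= alpha:
            keep = i
    return [p for p in ordered[:keep] if not p.is_decoy]


def all_all(complete_search: List[PSM], alpha: float = 0.01) -> List[PSM]:
    """Assess FDR on every PSM of the complete search, then report subset hits."""
    return [p for p in accept_at_fdr(complete_search, alpha) if p.in_subset]


def all_sub(complete_search: List[PSM], alpha: float = 0.01) -> List[PSM]:
    """Keep only subset PSMs from the complete search, then assess FDR on them."""
    subset = [p for p in complete_search if p.in_subset]
    return accept_at_fdr(subset, alpha)
```

Both functions take the same complete-search results; only the set of PSMs on which the FDR is estimated differs. When the subset has a larger fraction of incorrect PSMs than the full set, `all_all` can accept subset PSMs that `all_sub` rejects at the same nominal level.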

Supplementary Figure 3 Histograms of MS-GF+ scores (grey) for the search-subset-assess-subset (sub-sub) method in the Plasmodium falciparum example (Plasmodium-subset).

Common PSMs (green) and PSMs that switched peptide sequences (orange) in the sub-sub search (Plasmodium database) compared with the complete search (human + Plasmodium database). The histogram shows that the sub-sub strategy forces many spectra onto incorrect subset peptides (huge first mode of the distribution). Of the sub-sub PSMs above the 1% FDR cutoff, 0.6% switched peptide sequences (orange) compared with the complete search. Moreover, these switched PSMs have lower scores than in the complete search and are questionable at best (black and orange boxplots below the histogram).

Supplementary Figure 4 Boxplot of the fractions of subset PSMs that matched a different peptide sequence in the complete (all-all) and the subset (sub-sub) search for 36 different GO subsets of the Pyrococcus furiosus example.

PSM subset lists were constructed at 1% FDR (panel a) or 5% FDR (panel b). The vertical grey line indicates the FDR cutoff. Since these PSMs always have a higher score in the complete search, we assume that the match in the subset search is likely a false positive. Most subsets return a higher fraction of switched PSMs than the given FDR cutoff. This suggests that the sub-sub strategy suffers from inaccurate FDR control for most GO subsets.
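
The switched-PSM fraction plotted here can be computed along the following lines. This is a minimal sketch: the function name and the spectrum-to-peptide dictionaries are hypothetical, and the inputs are assumed to be the accepted sub-sub PSMs and the best matches of the same spectra in the complete search.

```python
from typing import Dict


def switched_fraction(subset_hits: Dict[str, str],
                      complete_hits: Dict[str, str]) -> float:
    """Fraction of accepted sub-sub PSMs whose peptide differs in the complete search.

    Both arguments map a spectrum identifier to its best-matching peptide.
    A spectrum missing from complete_hits is counted as switched; in a real
    complete search every spectrum is expected to have a best match.
    """
    if not subset_hits:
        return 0.0
    switched = sum(
        1
        for spectrum, peptide in subset_hits.items()
        if complete_hits.get(spectrum) != peptide
    )
    return switched / len(subset_hits)
```

Comparing this fraction with the nominal FDR level (0.01 or 0.05) for each GO subset gives the kind of check summarized in the boxplots: a fraction above the nominal level suggests that the sub-sub FDR estimate is too optimistic for that subset.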

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Methods

About this article

Cite this article

Sticker, A., Martens, L. & Clement, L. Mass spectrometrists should search for all peptides, but assess only the ones they care about. Nat Methods 14, 643–644 (2017). https://doi.org/10.1038/nmeth.4338

  • DOI: https://doi.org/10.1038/nmeth.4338
