SAINT: probabilistic scoring of affinity purification–mass spectrometry data

Journal name: Nature Methods
Volume: 8
Pages: 70–73
DOI: 10.1038/nmeth.1541

We present 'significance analysis of interactome' (SAINT), a computational tool that assigns confidence scores to protein-protein interaction data generated using affinity purification–mass spectrometry (AP-MS). The method uses label-free quantitative data and constructs separate distributions for true and false interactions to derive the probability of a bona fide protein-protein interaction. We show that SAINT is applicable to data of different scales and protein connectivity and allows transparent analysis of AP-MS data.
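
The scoring idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that spectral counts follow Poisson distributions with one rate for true interactions and one for false interactions, and that the per-replicate probabilities are averaged into a single score. The function names, the rates and the prior are hypothetical and are not taken from the SAINT software.

```python
import math


def poisson_pmf(count, rate):
    """Poisson probability of observing `count` spectra at mean `rate`."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def replicate_posterior(count, rate_true, rate_false, prior_true):
    """Bayes rule: P(true interaction | spectral count) for one replicate."""
    p_true = prior_true * poisson_pmf(count, rate_true)
    p_false = (1.0 - prior_true) * poisson_pmf(count, rate_false)
    return p_true / (p_true + p_false)


def interaction_probability(counts, rate_true, rate_false, prior_true):
    """Summary score for one prey-bait pair.

    Assumption: the summary is the mean of the replicate posteriors.
    """
    posteriors = [
        replicate_posterior(c, rate_true, rate_false, prior_true) for c in counts
    ]
    return sum(posteriors) / len(posteriors)


# Hypothetical rates and prior, chosen only to show the behaviour.
high = interaction_probability([12, 9], rate_true=10.0, rate_false=1.0, prior_true=0.1)
low = interaction_probability([0, 1], rate_true=10.0, rate_false=1.0, prior_true=0.1)
print(f"high-count pair: {high:.4f}, low-count pair: {low:.4f}")
```

In this sketch a pair with consistently high counts scores close to 1 and a pair with counts near the false-interaction rate scores close to 0. In practice the two distributions would be estimated from the data rather than fixed by hand.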

Figures

  1. Probability model in SAINT.

    (a,b) Interaction data in the presence (a) and absence (b) of control purifications. Schematic of the experimental AP-MS procedure is shown at the top and a spectral count interaction table is illustrated at the bottom. Ctrl, control; rep, replicate; freq, frequency. (c) Modeling spectral count distributions for true and false interactions. For the interaction between prey i and bait j, SAINT uses all relevant data for the two proteins, as shown in the column of the bait (green) and the data in the row of the prey (orange) in a and b. (d) Probability is calculated for each replicate by application of Bayes rule, and a summary probability is calculated for the interaction pair (i,j).

  2. Analysis of TIP49 and DUB datasets.

    (a) Benchmarking of filtered interactions in the TIP49 dataset by the overlap with interactions previously reported in BioGRID and iRefWeb databases. (b) Co-annotation of interaction partners to common GO terms in 'biological processes' in the TIP49 dataset. (c) Benchmarking against BioGRID and iRefWeb in the DUB dataset. (d) Co-annotation to GO terms in the DUB dataset.
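
The benchmarking described in panels (a) and (c) can be sketched as follows: filter interactions at a score threshold and report the fraction that also appear in a set of previously reported interactions. This is a sketch under stated assumptions, not the procedure used for the figure. The function name `benchmark_overlap`, the protein identifiers and the scores are all hypothetical.

```python
def benchmark_overlap(scored, known, thresholds):
    """For each threshold, return (threshold, n_kept, fraction_previously_reported).

    scored: dict mapping (bait, prey) tuples to a confidence score
    known:  iterable of protein pairs from a reference database (order-insensitive)
    """
    known_pairs = {frozenset(pair) for pair in known}
    rows = []
    for t in thresholds:
        kept = [pair for pair, score in scored.items() if score >= t]
        hits = sum(1 for pair in kept if frozenset(pair) in known_pairs)
        fraction = hits / len(kept) if kept else 0.0
        rows.append((t, len(kept), fraction))
    return rows


# Hypothetical scores and reference pairs, for illustration only.
scored = {("BAIT1", "PREY_A"): 0.95, ("BAIT1", "PREY_B"): 0.60, ("BAIT2", "PREY_C"): 0.20}
known = [("PREY_A", "BAIT1")]
for threshold, n_kept, fraction in benchmark_overlap(scored, known, [0.9, 0.5, 0.0]):
    print(threshold, n_kept, round(fraction, 3))
```

Pairs are compared as unordered sets, so a reference entry listed as (prey, bait) still matches a scored (bait, prey) pair.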

Author information

Affiliations

  1. Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA.

    • Hyungwon Choi,
    • Dattatreya Mellacheruvu,
    • Damian Fermin &
    • Alexey I Nesvizhskii
  2. Centre for Systems Biology, Samuel Lunenfeld Research Institute, Toronto, Ontario, Canada.

    • Brett Larsen,
    • Zhen-Yuan Lin,
    • Ashton Breitkreutz,
    • Mike Tyers &
    • Anne-Claude Gingras
  3. Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA.

    • Zhaohui S Qin
  4. Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.

    • Mike Tyers &
    • Anne-Claude Gingras
  5. Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK.

    • Mike Tyers
  6. Centre for Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK.

    • Mike Tyers
  7. Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

    • Alexey I Nesvizhskii
  8. Present address: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, USA.

    • Zhaohui S Qin

Contributions

H.C. and A.I.N. developed, implemented and tested the SAINT method; H.C. wrote the software; B.L., A.B., Z.-Y.L., A.-C.G. and M.T. generated data for the initial SAINT modeling and provided feedback on the model performance; D.M. and D.F. assisted with data analysis and processing; Z.S.Q. contributed to statistical model development; H.C., A.-C.G. and A.I.N. wrote the manuscript; A.I.N. and A.-C.G. conceived the study; A.I.N. directed the project with input from A.-C.G.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (240 KB)

    Supplementary Figure 1

Excel files

  1. Supplementary Table 1 (1008 KB)

    Data for the TIP49 dataset. (a) List of all detected interactions and scores from PP-NSAF, CompPASS and SAINT. (b) All interactions in control purifications were included in a separate table after merging of 35 technical replicate purifications into 9 purifications. (c) Table of technical replicates of control purifications. (d) GO terms enrichment in top scoring interactions for each scoring method.

  2. Supplementary Table 2 (3 MB)

    Data for the DUB dataset. (a) List of all detected interactions and scores from CompPASS and SAINT. (b–d) GO terms enrichment in top scoring interactions for each scoring method.

  3. Supplementary Table 3 (100 KB)

    Data for the CDC23 dataset. List of all detected interactions with SAINT scores and results reported by t-test.

Zip files

  1. Supplementary Software (2 MB)
