An informatic framework for decoding protein complexes by top-down mass spectrometry

Abstract

Efforts to map the human protein interactome have resulted in information about thousands of multi-protein assemblies housed in public repositories, but the molecular characterization and stoichiometry of their protein subunits remain largely unknown. Here, we report a computational search strategy that supports hierarchical top-down analysis for precise identification and scoring of multi-proteoform complexes by native mass spectrometry.

Figure 1: Computational platform and workflow for the characterization of human multi-proteoform complexes (MPCs).
Figure 2: Characterization of toyocamycin nitrile hydratase (TNH), a hexameric multi-proteoform complex purified to homogeneity.
Figure 3: A summary of the MPCs identified in this study.

Acknowledgements

The authors thank members of the Kelleher research group and V. Wysocki for helpful discussions and advice. O.S.S. is supported by a US National Science Foundation Graduate Research Fellowship (2014171659). P.C.H. is a recipient of a Northwestern University Chemistry of Life Processes Institute Postdoctoral Fellowship Award. L.H.F.D.V. is supported under CNPq research grant 202011/2012-7 from the Brazilian government. H.S.S. is supported under the Science Without Borders scholarship 88888.075416/2013-00 from the Coordination for the Improvement of Higher Education Personnel under the Brazilian government. This work was supported by grants from the W.M. Keck Foundation (DT061512) and the US National Institutes of Health (GM067193) to N.L.K.

Author information

P.C.H. and N.L.K. devised the computational approach with contributions from O.S.S., R.D.L. and R.T.F. O.S.S. interpreted the data with help from P.C.H. and P.D.C. O.S.S. and P.D.C. collected native MS data. Further supporting results were collected by L.F., K.R.D., L.H.F.D.V. and R.D.M. R.D.L. developed the MPC score. B.P.E., J.B.G. and R.T.F. implemented the search engine and scoring system. M.E.B., S.R.H. and A.A.M. participated in the design of the instrument and support of initial experiments. M.T.N. and V.B. provided the TNH complex. O.S.S., P.C.H. and N.L.K. co-wrote the manuscript with critical inputs from N.A.H. and H.S.S. and contributions from all coauthors. N.L.K. initiated and supervised the project.

Correspondence to Neil L Kelleher.

Ethics declarations

Competing interests

M.E.B., S.R.H. and A.A.M. are employees of Thermo Fisher Scientific, and R.D.L., N.L.K., R.T.F. and J.B.G. help commercialize ProSightPC.

Integrated supplementary information

Supplementary Figure 1 The 144 isoform combinations of human 20S proteasome.

Assessment of the combinatorial expansion of human 20S proteasome MPCs considering only Swiss-Prot annotated isoforms of its 14 subunits without PTMs. Assuming one copy of each subunit, the total number of possible combinations of subunit isoforms is 144.
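The combinatorial expansion described above is a product of per-subunit isoform counts. The sketch below illustrates that arithmetic only. The legend does not list how many isoforms each subunit has, so the counts in `ISOFORM_COUNTS` and the subunit names are hypothetical placeholders, chosen so that the product equals the stated total of 144; they are not the real Swiss-Prot annotations.

```python
from itertools import product
from math import prod

# HYPOTHETICAL isoform counts for 14 subunits (not real Swiss-Prot data).
# Chosen only so that the product equals the total of 144 stated in the legend.
ISOFORM_COUNTS = {
    "subunit_01": 2, "subunit_02": 2, "subunit_03": 2, "subunit_04": 2,
    "subunit_05": 3, "subunit_06": 3, "subunit_07": 1, "subunit_08": 1,
    "subunit_09": 1, "subunit_10": 1, "subunit_11": 1, "subunit_12": 1,
    "subunit_13": 1, "subunit_14": 1,
}


def count_combinations(counts):
    """Number of complexes with one copy of each subunit: product of isoform counts."""
    return prod(counts.values())


def enumerate_combinations(counts):
    """Yield every combination as a tuple of 1-based isoform indices, one per subunit."""
    return product(*(range(1, n + 1) for n in counts.values()))


if __name__ == "__main__":
    print(count_combinations(ISOFORM_COUNTS))
```

Adding PTMs would multiply each per-subunit count by the number of modified forms considered, which is why the legend restricts the estimate to unmodified isoforms.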

Supplementary Figure 2 Screenshot of the input for pyruvate kinase on the web-based search tool.

To search for candidate multi-proteoform complexes (MPCs) present in the sample, the user enters the collected mass spectrometric data, namely the neutral masses of the intact complex (MS1), the ejected monomer (MS2) and the backbone fragment ions (MS3), together with the preferred error tolerances, in the boxes provided. One additional custom complex can be added via the new complex feature (input as UniProt accession numbers) and will be included in the search. The search returns all MPCs directly associated with the ejected monomer, ranked according to their MPC-scores. Candidates with an MPC-score below 0.0001 are not displayed. The search results can be exported to an Excel spreadsheet for further processing. In this PK example, we used error tolerances of 500 Da, 100 Da and 10 ppm for MS1, MS2 and MS3, respectively.
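The two kinds of error tolerance mentioned in this legend (absolute, in Da, and relative, in ppm) and the display cutoff on the MPC-score can be sketched as follows. This is a minimal illustration, not the web tool's implementation: the function names, the `Candidate` class and the example masses and scores are assumptions introduced here.

```python
from dataclasses import dataclass


def within_tolerance(observed, theoretical, tolerance, unit):
    """Return True if the observed mass matches the theoretical mass.

    unit == "Da":  absolute mass difference compared with the tolerance.
    unit == "ppm": relative difference (parts per million of the theoretical mass).
    """
    delta = abs(observed - theoretical)
    if unit == "Da":
        return delta <= tolerance
    if unit == "ppm":
        return delta / theoretical * 1e6 <= tolerance
    raise ValueError(f"unknown tolerance unit: {unit!r}")


@dataclass
class Candidate:
    """Hypothetical container for one search hit (name and MPC-score)."""
    name: str
    score: float


def rank_candidates(candidates, min_score=0.0001):
    """Drop hits scoring below the display cutoff, then sort best-first."""
    kept = [c for c in candidates if c.score >= min_score]
    return sorted(kept, key=lambda c: c.score, reverse=True)


if __name__ == "__main__":
    # Illustrative masses only; tolerances are those quoted for the PK example.
    print(within_tolerance(232400.0, 232000.0, 500, "Da"))   # MS1 tier
    print(within_tolerance(58050.0, 58000.0, 100, "Da"))     # MS2 tier
    print(within_tolerance(10000.05, 10000.0, 10, "ppm"))    # MS3 tier
```

A ppm tolerance scales with mass, which suits high-resolution fragment ions, whereas the wider Da windows accommodate adducts and incomplete desolvation on intact complexes and ejected monomers.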

Supplementary Figure 3 Characterization of a purified pyruvate kinase MPC.

The 3-tier top-down MS/MS for this sample is illustrated with (a) the MS1 data; (b) the MS2 spectrum and (c) the graphical fragment ion map of the ejected monomer that matches to P11974-1, N-terminally acetylated (red), with a known S401A sequence variant marked in blue (E-value of 2 × 10^−51, C-score of 61). The automated search result that combines MS1, MS2, and MS3 information is shown in (d) and produced a near-maximal MPC-score of 2,983 for the homotetrameric species. See Supplementary Fig. 4 for characterization of additional MPCs.

Supplementary Figure 4 The 13 characterized MPCs of pyruvate kinase.

(a) Ten peaks of different masses in the MS1, here focused on the 35+ species for simplicity, could be assigned to 13 unique MPCs. (b) Monomer ejection from the isolated lower-mass cluster of MPCs indicated a total of four unique ejected monomer species; monomer ejection of the higher-mass cluster of MPCs is shown in Supplementary Fig. 3. (c) Fragmentation of the isolated modified form localized the mass addition to Cys165, consistent with its modification by β-mercaptoethanol. (d) Fragmentation of the lower mass, unmodified ejected monomer allowed for its identification as the N-terminally acetylated, endogenously cleaved [23–531] proteoform. The modified, truncated form was present at insufficient abundance for fragmentation. The three characterized forms (third form fragment map shown in Supplementary Fig. 3c) were all observed as isoform M1, N-terminally acetylated (red), and contained the S401A mutation (highlighted in blue).

Supplementary Figure 5 Detailed analysis of an unknown protein complex detected in the pyruvate kinase protein standard.

The data from the 3-tier tandem MS procedure for the detected complex are shown in panels (a–c), ending with the graphical fragment ion map of the ejected monomer that corresponds to G1U7S4, the uncharacterized protein phosphoglycerate mutase 2, PGAM2, N-terminally acetylated (red), and identified with an E-value of 6 × 10^−12 and a C-score of 45. In the MS2 spectrum, a set of satellite peaks ~128 Da lower in mass is consistent with a C-terminal lysine cleavage; the 9+ species is shown in the inset (b). After identification of this complex as PGAM2, addition of the uncleaved monomer to the new CORUM/Proteoform and CORUM/MPC hybrid databases led to the result shown in (d), with an MPC-score of 2,983, which emerged from the Fig. 1 process.

Supplementary Figure 6 Analysis of the endogenous human GAPDH complex, fractionated from HeLa S3 cells.

The 3-tier tandem mass spectrometric data for the detected complex are illustrated with the (a) MS1, (b) MS2 (multiple forms shown in inset) and (c) graphical fragment ion map of the ejected monomer that corresponds to P04406-1, the canonical GAPDH isoform (E-value of 1 × 10^−23, C-score of 3). The modification to Cys152 was localized as discussed in the supplementary discussion and is highlighted in orange. (d) The homotetrameric MPC identified from a database search combining MS1, the singly modified MS2 and MS3 information was assigned a very high MPC-score of 2,924.

Supplementary Figure 7 Analysis of the ferritin complex from horse spleen yielding the correct subunit stoichiometry and detailed proteoform characterization.

The 3-tier tandem mass spectrometric data for the detected complex are illustrated with the (a) MS1, (b) MS2 (one ejected subunit observed) and (c) graphical fragment ion map of the ejected monomer that corresponds to P02791, the ferritin L-chain with an N-terminal acetylation and a previously unannotated cysteine methyl disulfide modification at Cys49 (E-value of 4 × 10^−91, C-score of 178). (d) The 24-mer ferritin MPC with the correct subunit stoichiometry (i.e., 15× L-chain, 9× H-chain) was the top hit with a good MPC-score of 74 from a search combining MS1, MS2, and MS3 information.
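To show why an intact mass constrains subunit stoichiometry, the sketch below enumerates every split of a 24-mer between two subunit types and keeps those whose summed mass falls within a Da tolerance of the observed value. It is a simplified illustration under stated assumptions, not the scoring procedure used in the study: the function name is hypothetical, and the subunit masses are round placeholder values rather than measured ferritin L-chain and H-chain masses.

```python
def consistent_stoichiometries(observed_mass, mass_a, mass_b, n_subunits, tolerance_da):
    """Return all (n_a, n_b) with n_a + n_b == n_subunits whose total mass
    lies within tolerance_da of the observed intact mass."""
    hits = []
    for n_a in range(n_subunits + 1):
        n_b = n_subunits - n_a
        total = n_a * mass_a + n_b * mass_b
        if abs(total - observed_mass) <= tolerance_da:
            hits.append((n_a, n_b))
    return hits


if __name__ == "__main__":
    # PLACEHOLDER masses in Da (assumptions for illustration, not measured values).
    L_CHAIN = 20000.0
    H_CHAIN = 21000.0
    observed = 15 * L_CHAIN + 9 * H_CHAIN  # synthetic "observed" intact mass
    print(consistent_stoichiometries(observed, L_CHAIN, H_CHAIN, 24, 500.0))
```

With these placeholder values, adjacent stoichiometries differ by 1,000 Da, so a 500 Da window admits a single solution; when subunit masses are closer together, several stoichiometries can fall inside the window and the MS2 and MS3 tiers are needed to discriminate between them.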

Supplementary Figure 8 Analysis of human 20S proteasome complex.

The 3-tier tandem mass spectrometric data for the proteasome are illustrated with (a) MS1, (b) MS2 data and (c) the graphical fragment maps of four ejected monomers corresponding to subunits beta-6, alpha-4, alpha-5 (co-isolated with alpha-4), and alpha-7; E-values and C-scores from step 2 of the search framework are shown. (d) The highest MPC-scores associated with searches of each subunit. The isoform information and stoichiometry of these results are shown in Supplementary Table 4.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Discussion and Supplementary Tables 1–4. (PDF 2633 kb)

Supplementary Table 5

Neutral monoisotopic masses for use with the web tool (XLSX 45 kb)

About this article

Cite this article

Skinner, O., Havugimana, P., Haverland, N. et al. An informatic framework for decoding protein complexes by top-down mass spectrometry. Nat Methods 13, 237–240 (2016). doi:10.1038/nmeth.3731
