Efforts to map the human protein interactome have resulted in information about thousands of multi-protein assemblies housed in public repositories, but the molecular characterization and stoichiometry of their protein subunits remain largely unknown. Here, we report a computational search strategy that supports hierarchical top-down analysis for precise identification and scoring of multi-proteoform complexes by native mass spectrometry.
The authors thank members of the Kelleher research group and V. Wysocki for helpful discussions and advice. O.S.S. is supported by a US National Science Foundation Graduate Research Fellowship (2014171659). P.C.H. is a recipient of a Northwestern University Chemistry of Life Processes Institute Postdoctoral Fellowship Award. L.H.F.D.V. is supported under CNPq research grant 202011/2012-7 from the Brazilian government. H.S.S. is supported under the Science Without Borders scholarship 88888.075416/2013-00 from the Coordination for the Improvement of Higher Education Personnel under the Brazilian government. This work was supported by grants from the W.M. Keck Foundation (DT061512) and the US National Institutes of Health (GM067193) to N.L.K.
M.E.B., S.R.H. and A.A.M. are employees of Thermo Fisher Scientific, and R.D.L., N.L.K., R.T.F. and J.B.G. help commercialize ProSightPC.
Integrated supplementary information
Assessment of the combinatorial expansion of human 20S proteasome MPCs considering only Swiss-Prot annotated isoforms of its 14 subunits without PTMs. Assuming one copy of each subunit, the total number of possible combinations of subunit isoforms is 144.
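The count in this caption is a product over subunits: with one copy of each subunit, every subunit independently contributes its number of annotated isoforms. A minimal sketch of that arithmetic follows. The function name and the per-subunit counts are illustrative placeholders, not the actual Swiss-Prot isoform counts for the proteasome.

```python
from math import prod


def count_isoform_combinations(isoforms_per_subunit):
    """Return the number of distinct complexes that can be built when each
    subunit is present in exactly one copy and may be any of its isoforms."""
    return prod(isoforms_per_subunit.values())


# Hypothetical toy complex with three subunits (NOT real proteasome data).
toy_counts = {"subunit_A": 2, "subunit_B": 3, "subunit_C": 1}
toy_total = count_isoform_combinations(toy_counts)
print(toy_total)
```

Adding PTMs or allowing more than one copy of a subunit would multiply the search space further, which is why the caption restricts the estimate to isoforms only.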
To search for candidate multi-proteoform complexes (MPCs) present in the sample, the user enters the collected mass spectrometric data, namely the neutral masses of the intact complex (MS1), the ejected monomer (MS2) and the backbone fragment ions (MS3), together with the preferred error tolerances, in the boxes provided. One additional custom complex can be added via the new complex feature (input as UniProt accession numbers) and will be included in the search. The search returns all MPCs directly associated with the ejected monomer, ranked according to their MPC-scores. Candidates with an MPC-score below 0.0001 are not displayed. The search results can be exported to an Excel spreadsheet for further processing. In this PK example, we used error tolerances of 500 Da, 100 Da and 10 ppm for MS1, MS2 and MS3, respectively.
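The tolerance checks and the display cutoff described in this caption can be sketched as follows. This is an illustration only: the function names, the candidate dictionaries and the example masses are assumptions, and the calculation of the MPC-score itself is not described in this section, so scores are taken as given.

```python
def within_da(observed, theoretical, tol_da):
    """Absolute mass tolerance in daltons (used here for MS1 and MS2)."""
    return abs(observed - theoretical) <= tol_da


def within_ppm(observed, theoretical, tol_ppm):
    """Relative mass tolerance in parts per million (used here for MS3)."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def displayed_candidates(candidates, min_score=0.0001):
    """Drop candidates scoring below the cutoff and rank the rest, best first."""
    kept = [c for c in candidates if c["score"] >= min_score]
    return sorted(kept, key=lambda c: c["score"], reverse=True)


# Hypothetical masses, checked with the tolerances quoted in the caption.
ms1_ok = within_da(232_400.0, 232_000.0, tol_da=500.0)
ms2_ok = within_da(58_050.0, 58_000.0, tol_da=100.0)
ms3_ok = within_ppm(1000.005, 1000.0, tol_ppm=10.0)

# Hypothetical candidate list with precomputed scores.
ranked = displayed_candidates([
    {"name": "candidate_1", "score": 5.0},
    {"name": "candidate_2", "score": 0.00001},
    {"name": "candidate_3", "score": 2983.0},
])
print([c["name"] for c in ranked])
```

An absolute tolerance suits the broad peaks of intact complexes and ejected monomers, while a relative (ppm) tolerance suits isotopically resolved fragment ions.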
The 3-tier top-down MS/MS for this sample is illustrated with (a) the MS1 data; (b) the MS2 spectrum and (c) the graphical fragment ion map of the ejected monomer that matches to P11974-1, N-terminally acetylated (red), with a known S401A sequence variant marked in blue (E-value of 2 × 10^−51, C-score of 61). The automated search result that combines MS1, MS2, and MS3 information is shown in (d) and produced a near-maximal MPC-score of 2,983 for the homotetrameric species. See Supplementary Fig. 4 for characterization of additional MPCs.
(a) Ten peaks of different masses in the MS1, here focused on the 35+ species for simplicity, could be assigned to 13 unique MPCs. (b) Monomer ejection from the isolated lower-mass cluster of MPCs indicated a total of four unique ejected monomer species; monomer ejection of the higher-mass cluster of MPCs is shown in Supplementary Fig. 3. (c) Fragmentation of the isolated modified form localized the mass addition to Cys165, consistent with its modification by β-mercaptoethanol. (d) Fragmentation of the lower mass, unmodified ejected monomer allowed for its identification as the N-terminally acetylated, endogenously cleaved [23–531] proteoform. The modified, truncated form was present at insufficient abundance for fragmentation. The three characterized forms (third form fragment map shown in Supplementary Fig. 3c) were all observed as isoform M1, N-terminally acetylated (red), and contained the S401A mutation (highlighted in blue).
Supplementary Figure 5 Detailed analysis of an unknown protein complex detected in the pyruvate kinase protein standard.
The data and fragment map from the 3-tier tandem MS procedure for the detected complex are shown in panels (a-c), followed by the graphical fragment ion map of the ejected monomer that corresponds to G1U7S4, the uncharacterized protein phosphoglycerate mutase 2, PGAM2, N-terminally acetylated (red), and identified with an E-value of 6 × 10^−12 and a C-score of 45. In the MS2 spectrum, a set of satellite peaks ~128 Da lower in mass is consistent with a C-terminal lysine cleavage; the 9+ species is shown in the inset (b). After identification of this complex as PGAM2, addition of the uncleaved monomer to the new CORUM/Proteoform and CORUM/MPC hybrid databases led to the result with an MPC-score of 2,983 shown in (d), which emerged from the Fig. 1 process.
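The assignment of the ~128 Da satellite peaks to loss of a C-terminal lysine rests on the mass of a lysine residue (C6H12N2O, average mass about 128.17 Da). A minimal sketch of that check is given below; the function name, the 1 Da tolerance and the monomer masses are hypothetical and are not values from the study.

```python
# Average mass of a lysine residue, C6H12N2O, in daltons.
LYS_RESIDUE_AVG_DA = 128.17


def matches_residue_loss(main_mass, satellite_mass, residue_mass, tol_da=1.0):
    """True if the satellite species is lighter than the main species by
    one residue mass, within the given tolerance in daltons."""
    return abs((main_mass - satellite_mass) - residue_mass) <= tol_da


# Hypothetical neutral masses for the intact and the clipped monomer.
intact_monomer = 28_700.0
clipped_monomer = 28_571.9

lys_loss = matches_residue_loss(intact_monomer, clipped_monomer, LYS_RESIDUE_AVG_DA)
print(lys_loss)
```

A mass difference alone cannot distinguish lysine from other species of nearly the same mass (glutamine, for example, has an average residue mass of about 128.13 Da), so the sequence context from the database match is still needed.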
Supplementary Figure 6 Analysis of the endogenous human GAPDH complex, fractionated from HeLa S3 cells.
The 3-tier tandem mass spectrometric data for the detected complex is illustrated with the (a) MS1, (b) MS2 (multiple forms shown in inset) and (c) graphical fragment ion map of the ejected monomer that corresponds to P04406-1, the canonical GAPDH isoform (E-value of 1 × 10^−23, C-score of 3). The modification to Cys152 was localized as discussed in the supplementary discussion and is highlighted in orange. (d) The homotetrameric MPC, identified from a database search combining MS1, the singly modified MS2, and MS3 information, was assigned a very high MPC-score of 2,924.
Supplementary Figure 7 Analysis of the ferritin complex from horse spleen yielding the correct subunit stoichiometry and detailed proteoform characterization.
The 3-tier tandem mass spectrometric data for the detected complex is illustrated with the (a) MS1, (b) MS2 (one ejected subunit observed) and (c) graphical fragment ion map of the ejected monomer that corresponds to P02791, the ferritin L-chain with an N-terminal acetylation and a previously unannotated cysteine methyl disulfide modification at Cys49 (E-value of 4 × 10^−91, C-score of 178). (d) The 24-mer ferritin MPC with the correct subunit stoichiometry (i.e., 15× L-chain, 9× H-chain) was the top hit, with a good MPC-score of 74, from a search combining MS1, MS2, and MS3 information.
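For a complex with a fixed number of subunits drawn from two chain types, the intact mass constrains the stoichiometry, because each L-to-H ratio predicts a different total mass. The sketch below enumerates every split of a 24-mer and picks the one closest to an observed mass. It is an illustration under stated assumptions rather than the procedure used by the search: the function name and both subunit masses are hypothetical round numbers, not measured ferritin values.

```python
def best_stoichiometry(observed_mass, mass_l, mass_h, n_subunits=24):
    """Return (n_L, n_H, mass_error) for the split of n_subunits whose
    predicted intact mass lies closest to the observed mass."""
    best = None
    for n_l in range(n_subunits + 1):
        n_h = n_subunits - n_l
        predicted = n_l * mass_l + n_h * mass_h
        error = abs(observed_mass - predicted)
        if best is None or error < best[2]:
            best = (n_l, n_h, error)
    return best


# Hypothetical subunit masses in daltons (placeholders, not measured values).
L_CHAIN_DA = 20_000.0
H_CHAIN_DA = 21_000.0

# Hypothetical observed intact mass, 120 Da above the 15 L + 9 H prediction.
observed = 15 * L_CHAIN_DA + 9 * H_CHAIN_DA + 120.0

n_l, n_h, err = best_stoichiometry(observed, L_CHAIN_DA, H_CHAIN_DA)
print(n_l, n_h, err)
```

The approach only resolves neighbouring stoichiometries when the MS1 mass error is smaller than half the mass difference between the two chains, which is one reason the MS2 and MS3 tiers are combined with MS1 in the search.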
The 3-tier tandem mass spectrometric data for the proteasome is illustrated with (a) MS1, (b) MS2 data and (c) the graphical fragment maps of four ejected monomers corresponding to subunits beta-6, alpha-4, alpha-5 (co-isolated with alpha-4), and alpha-7; E-values and C-scores from step 2 of the search framework are shown. (d) The highest MPC-scores associated with searches of each subunit. The isoform information and stoichiometry of these results are shown in Supplementary Table 4.
About this article
Cite this article
Skinner, O., Havugimana, P., Haverland, N. et al. An informatic framework for decoding protein complexes by top-down mass spectrometry. Nat Methods 13, 237–240 (2016) doi:10.1038/nmeth.3731