An informatic framework for decoding protein complexes by top-down mass spectrometry

Skinner, Owen S; Havugimana, Pierre C; Haverland, Nicole A; Fornelli, Luca; Early, Bryan P; Greer, Joseph B; Fellers, Ryan T; Durbin, Kenneth R; Do Vale, Luis H F; Melani, Rafael D; Seckler, Henrique S; Nelp, Micah T; Belov, Mikhail E; Horning, Stevan R; Makarov, Alexander A; LeDuc, Richard D; Bandarian, Vahe; Compton, Philip D; Kelleher, Neil L

doi:10.1038/nmeth.3731

Brief Communication
Published: 18 January 2016

Nature Methods volume 13, pages 237–240 (2016)

Abstract

Efforts to map the human protein interactome have resulted in information about thousands of multi-protein assemblies housed in public repositories, but the molecular characterization and stoichiometry of their protein subunits remain largely unknown. Here, we report a computational search strategy that supports hierarchical top-down analysis for precise identification and scoring of multi-proteoform complexes by native mass spectrometry.

**Figure 1: Computational platform and workflow for the characterization of human multi-proteoform complexes (MPCs).**

**Figure 2: Characterization of toyocamycin nitrile hydratase (TNH), a hexameric multi-proteoform complex purified to homogeneity.**

**Figure 3: A summary of the MPCs identified in this study.**

Acknowledgements

The authors thank members of the Kelleher research group and V. Wysocki for helpful discussions and advice. O.S.S. is supported by a US National Science Foundation Graduate Research Fellowship (2014171659). P.C.H. is a recipient of a Northwestern University Chemistry of Life Processes Institute Postdoctoral Fellowship Award. L.H.F.D.V. is supported under CNPq research grant 202011/2012-7 from the Brazilian government. H.S.S. is supported under the Science Without Borders scholarship 88888.075416/2013-00 from the Coordination for the Improvement of Higher Education Personnel under the Brazilian government. This work was supported by grants from the W.M. Keck Foundation (DT061512) and the US National Institutes of Health (GM067193) to N.L.K.

Author information

Owen S Skinner and Pierre C Havugimana: These authors contributed equally to this work.

Authors and Affiliations

Department of Chemistry, Northwestern University, Evanston, Illinois, USA
Owen S Skinner, Nicole A Haverland, Luca Fornelli, Rafael D Melani, Henrique S Seckler, Philip D Compton & Neil L Kelleher
Chemistry of Life Processes Institute, Northwestern University, Evanston, Illinois, USA
Pierre C Havugimana, Luca Fornelli, Richard D LeDuc & Neil L Kelleher
Proteomics Center of Excellence, Northwestern University, Evanston, Illinois, USA
Pierre C Havugimana, Luca Fornelli, Bryan P Early, Joseph B Greer, Ryan T Fellers, Luis H F Do Vale, Philip D Compton & Neil L Kelleher
Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, USA
Kenneth R Durbin & Neil L Kelleher
Brazilian Center for Protein Research, University of Brasilia, Brasilia, Federal District, Brazil
Luis H F Do Vale
Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona, USA
Micah T Nelp & Vahe Bandarian
Thermo Fisher Scientific (Bremen) GmbH, Bremen, Germany
Mikhail E Belov, Stevan R Horning & Alexander A Makarov

Contributions

P.C.H. and N.L.K. devised the computational approach with contributions from O.S.S., R.D.L. and R.T.F. O.S.S. interpreted the data with help from P.C.H. and P.D.C. O.S.S. and P.D.C. collected native MS data. Further supporting results were collected by L.F., K.R.D., L.H.F.D.V. and R.D.M. R.D.L. developed the MPC score. B.P.E., J.B.G. and R.T.F. implemented the search engine and scoring system. M.E.B., S.R.H. and A.A.M. participated in the design of the instrument and support of initial experiments. M.T.N. and V.B. provided the TNH complex. O.S.S., P.C.H. and N.L.K. co-wrote the manuscript with critical inputs from N.A.H. and H.S.S. and contributions from all coauthors. N.L.K. initiated and supervised the project.

Corresponding author

Correspondence to Neil L Kelleher.

Ethics declarations

Competing interests

M.E.B., S.R.H. and A.A.M. are employees of Thermo Fisher Scientific, and R.D.L., N.L.K., R.T.F. and J.B.G. help commercialize ProSightPC.

Integrated supplementary information

Supplementary Figure 1 The 144 isoform combinations of human 20S proteasome.

Assessment of the combinatorial expansion of human 20S proteasome MPCs considering only Swiss-Prot annotated isoforms of its 14 subunits without PTMs. Assuming one copy of each subunit, the total number of possible combinations of subunit isoforms is 144.
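
With one copy of each subunit, the number of possible complexes is the product of the isoform counts across subunits. The following is a minimal sketch of that arithmetic. The per-subunit counts are placeholders chosen only so that the product equals 144; they are not the actual Swiss-Prot annotations, and the function name is hypothetical.

```python
from math import prod

# Placeholder isoform counts for the 14 subunits of the 20S proteasome.
# ASSUMPTION: these values are illustrative only and were picked so the
# product is 144; they are not taken from Swiss-Prot or from the paper.
isoforms_per_subunit = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3]


def count_isoform_combinations(counts):
    """Count complexes that contain exactly one copy of each subunit.

    Each subunit contributes an independent choice of isoform, so the
    total is the product of the per-subunit isoform counts.
    """
    return prod(counts)


total_combinations = count_isoform_combinations(isoforms_per_subunit)
print(total_combinations)
```

Adding PTMs or allowing more than one copy per subunit would multiply this number further, which is why the caption restricts the count to annotated isoforms without PTMs.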

Supplementary Figure 2 Screenshot of the input for pyruvate kinase on the web-based search tool.

To search for candidate multi-proteoform complexes (MPCs) present in the sample, the user provides the collected mass spectrometric data, including the neutral masses of the intact complex (MS¹), ejected monomer (MS²) and backbone fragment ions (MS³), as well as the preferred error tolerance, in the boxes provided. One additional custom complex can be added via the new complex feature (input as UniProt accession numbers) and will be included in the search. The search returns all MPCs directly associated with the ejected monomer, ranked according to their MPC-scores. Candidates with an MPC-score below 0.0001 are not displayed. The search results can be exported to an Excel spreadsheet for further processing. In this PK example, we used error tolerances of 500 Da, 100 Da and 10 ppm for MS¹, MS² and MS³, respectively.
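
The tolerances quoted above mix absolute (Da) and relative (ppm) units, so a mass comparison has to be unit-aware. The sketch below shows one way such a check could look. It is not the web tool's implementation; the `TOLERANCES` table, the `matches` function and the example masses are all hypothetical, and only the three tolerance values come from the caption.

```python
# Tolerances from the PK example: (unit, value) per MS level.
# ASSUMPTION: the dictionary layout and names are illustrative only.
TOLERANCES = {
    "ms1": ("da", 500.0),   # intact complex
    "ms2": ("da", 100.0),   # ejected monomer
    "ms3": ("ppm", 10.0),   # backbone fragment ions
}


def matches(observed, theoretical, level):
    """Return True if an observed neutral mass lies within the tolerance
    configured for the given MS level."""
    unit, tolerance = TOLERANCES[level]
    error = abs(observed - theoretical)
    if unit == "ppm":
        # Convert the absolute error into parts per million of the
        # theoretical mass.
        error = error / theoretical * 1e6
    return error <= tolerance


# Hypothetical masses in Da, for illustration only.
print(matches(232_300.0, 232_000.0, "ms1"))  # 300 Da error vs 500 Da
print(matches(58_150.0, 58_000.0, "ms2"))    # 150 Da error vs 100 Da
print(matches(1000.008, 1000.0, "ms3"))      # about 8 ppm vs 10 ppm
```

A wide absolute window suits intact complexes, where adducts and incomplete desolvation shift the measured mass, while a ppm window suits isotopically resolved fragment ions.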

Supplementary Figure 3 Characterization of a purified pyruvate kinase MPC.

The 3-tier top-down MS/MS for this sample is illustrated with (a) the MS¹ data; (b) the MS² spectrum and (c) the graphical fragment ion map of the ejected monomer that matches to P11974-1, N-terminally acetylated (red), with a known S401A sequence variant marked in blue (E-value of 2 × 10⁻⁵¹, C-score of 61). The automated search result that combines MS¹, MS², and MS³ information is shown in (d) and produced a near-maximal MPC-score of 2,983 for the homotetrameric species. See Supplementary Fig. 4 for characterization of additional MPCs.

Supplementary Figure 4 The 13 characterized MPCs of pyruvate kinase.

(a) Ten peaks of different masses in the MS¹, here focused on the 35+ species for simplicity, could be assigned to 13 unique MPCs. (b) Monomer ejection from the isolated lower-mass cluster of MPCs indicated a total of four unique ejected monomer species; monomer ejection of the higher-mass cluster of MPCs is shown in Supplementary Fig. 3. (c) Fragmentation of the isolated modified form localized the mass addition to Cys165, consistent with its modification by β-mercaptoethanol. (d) Fragmentation of the lower mass, unmodified ejected monomer allowed for its identification as the N-terminally acetylated, endogenously cleaved [23–531] proteoform. The modified, truncated form was present at insufficient abundance for fragmentation. The three characterized forms (third form fragment map shown in Supplementary Fig. 3c) were all observed as isoform M1, N-terminally acetylated (red), and contained the S401A mutation (highlighted in blue).
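
Assigning several MS¹ peaks to complexes built from a handful of ejected monomer species amounts to enumerating multisets of monomers and comparing their summed mass with each intact mass. The following sketch illustrates that idea for a tetramer. It is not the authors' search engine; the monomer names, masses, mass shift and tolerance are all hypothetical placeholders.

```python
from itertools import combinations_with_replacement

# ASSUMPTION: hypothetical neutral monomer masses in Da; not from the paper.
monomers = {
    "full": 58_000.0,
    "full+mod": 58_076.0,
    "truncated": 55_700.0,
    "truncated+mod": 55_776.0,
}


def candidate_tetramers(intact_mass, monomers, tol_da, copies=4):
    """List monomer combinations whose summed mass is within tol_da of
    the observed intact mass.

    Order of subunits does not matter, so combinations with replacement
    (multisets) are enumerated rather than ordered tuples.
    """
    hits = []
    for combo in combinations_with_replacement(sorted(monomers), copies):
        total = sum(monomers[name] for name in combo)
        if abs(total - intact_mass) <= tol_da:
            hits.append((combo, total))
    return hits


# Hypothetical intact mass: three unmodified monomers plus one modified.
observed = 3 * 58_000.0 + 58_076.0
for combo, total in candidate_tetramers(observed, monomers, tol_da=10.0):
    print(combo, total)
```

Four monomer species give 35 possible tetramer compositions, so a tight tolerance or additional MS² and MS³ evidence is needed to narrow the candidates.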

Supplementary Figure 5 Detailed analysis of an unknown protein complex detected in the pyruvate kinase protein standard.

The data and fragment map from the 3-tier tandem MS procedure for the detected complex are shown in panels (a-c), followed by the graphical fragment ion map of the ejected monomer that corresponds to G1U7S4, the uncharacterized protein phosphoglycerate mutase 2, PGAM2, N-terminally acetylated (red), and identified with an E-value of 6 × 10⁻¹² and a C-score of 45. In the MS² spectrum, a set of satellite peaks ~128 Da lower in mass is consistent with a C-terminal lysine cleavage; the 9+ species is shown in the inset (b). After identification of this complex as PGAM2, addition of the uncleaved monomer to the new CORUM/Proteoform and CORUM/MPC hybrid databases led to the result with an MPC-score of 2,983 shown in (d), which emerged from the Fig. 1 process.

Supplementary Figure 6 Analysis of the endogenous human GAPDH complex, fractionated from HeLa S3 cells.

The 3-tier tandem mass spectrometric data for the detected complex is illustrated with the (a) MS¹, (b) MS² (multiple forms shown in inset) and (c) graphical fragment ion map of the ejected monomer that corresponds to P04406-1, the canonical GAPDH isoform (E-value of 1 × 10⁻²³, C-score of 3). The modification to Cys152 was localized as discussed in the supplementary discussion and is highlighted in orange. (d) The homotetrameric MPC identified from a database search combining MS¹, the singly modified MS², and MS³ information was assigned a very high MPC-score of 2,924.

Supplementary Figure 7 Analysis of the ferritin complex from horse spleen yielding the correct subunit stoichiometry and detailed proteoform characterization.

The 3-tier tandem mass spectrometric data for the detected complex is illustrated with the (a) MS¹, (b) MS² (one ejected subunit observed) and (c) graphical fragment ion map of the ejected monomer that corresponds to P02791, the ferritin L-chain with an N-terminal acetylation and a previously unannotated cysteine methyl disulfide modification at Cys49 (E-value of 4 × 10⁻⁹¹, C-score of 178). (d) The 24-mer ferritin MPC with the correct subunit stoichiometry (i.e., 15x L-chain, 9x H-chain) was the top hit with a good MPC-score of 74 from a search combining MS¹, MS², and MS³ information.
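
When a complex has a known total number of subunits drawn from two chain types, the stoichiometry can in principle be recovered by scanning the possible copy numbers against the intact mass. The sketch below illustrates this for a 24-mer. It is not the scoring procedure used in the paper; the function name, subunit masses, intact mass and tolerance are hypothetical placeholders.

```python
def solve_two_subunit_stoichiometry(intact_mass, mass_a, mass_b,
                                    n_subunits, tol_da):
    """Return (n_a, n_b, error) tuples whose summed subunit mass lies
    within tol_da of the observed intact mass.

    n_a runs from 0 to n_subunits and n_b is the remainder, so every
    composition of a fixed-size two-component complex is tested once.
    """
    solutions = []
    for n_a in range(n_subunits + 1):
        n_b = n_subunits - n_a
        error = n_a * mass_a + n_b * mass_b - intact_mass
        if abs(error) <= tol_da:
            solutions.append((n_a, n_b, error))
    return solutions


# ASSUMPTION: hypothetical masses in Da for an L-like and an H-like chain.
l_chain = 19_900.0
h_chain = 21_100.0
observed = 15 * l_chain + 9 * h_chain  # hypothetical intact mass

print(solve_two_subunit_stoichiometry(observed, l_chain, h_chain, 24, 500.0))
```

The approach only resolves a unique stoichiometry when the mass difference between the two chains exceeds the intact-mass tolerance; otherwise neighbouring compositions also pass the filter.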

Supplementary Figure 8 Analysis of human 20S proteasome complex.

The 3-tier tandem mass spectrometric data for the proteasome is illustrated with (a) MS¹, (b) MS² data and (c) the graphical fragment maps of four ejected monomers corresponding to subunits beta-6, alpha-4, alpha-5 (co-isolated with alpha-4), and alpha-7; E-values and C-scores from step 2 of the search framework are shown. (d) The highest MPC-scores associated with searches of each subunit. The isoform information and stoichiometry of these results are shown in Supplementary Table 4.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Discussion and Supplementary Tables 1–4. (PDF 2633 kb)

Supplementary Table 5

Neutral monoisotopic masses for use with the web tool (XLSX 45 kb)

About this article

Cite this article

Skinner, O., Havugimana, P., Haverland, N. et al. An informatic framework for decoding protein complexes by top-down mass spectrometry. Nat Methods 13, 237–240 (2016). https://doi.org/10.1038/nmeth.3731

Received: 03 February 2015
Accepted: 30 November 2015
Published: 18 January 2016
Issue Date: March 2016
DOI: https://doi.org/10.1038/nmeth.3731