SubCellBarCode: integrated workflow for robust spatial proteomics by mass spectrometry

Abstract

The molecular functions of a protein are defined by its inherent properties in relation to its environment and interaction network. Within a cell, this environment and network are defined by the subcellular location of the protein. Consequently, it is crucial to know the localization of a protein to fully understand its functions. Recently, we have developed a mass spectrometry– (MS) and bioinformatics-based pipeline to generate a proteome-wide resource for protein subcellular localization across multiple human cancer cell lines (www.subcellbarcode.org). Here, we present a detailed wet-lab protocol spanning from subcellular fractionation to MS-sample preparation and analysis. A key feature of this protocol is that it includes all generated cell fractions without discarding any material during the fractionation process. We also describe the subsequent quantitative MS-data analysis, machine learning–based classification, differential localization analysis and visualization of the output. For broad applicability, we evaluated the pipeline by using MS data generated by two different peptide pre-fractionation approaches, namely high-resolution isoelectric focusing and high-pH reverse-phase fractionation, as well as direct analysis without pre-fractionation by using long-gradient liquid chromatography-MS. Moreover, an R package covering the dry-lab part of the method was developed and made available through Bioconductor. The method is straightforward and robust, and the entire protocol, from cell harvest to classification output, can be performed within 1–2 weeks. The protocol enables accurate classification of proteins to 15 compartments and 4 neighborhoods, visualization of the output data and differential localization analysis including treatment-induced protein relocalization, condition-dependent localization or cell type–specific localization. The SubCellBarCode package is freely available at https://bioconductor.org/packages/devel/bioc/html/SubCellBarCode.html.

Fig. 1: Overview of SubCellBarCode method in three parts.
Fig. 2: Overview of the protocol with timing of different parts indicated.
Fig. 3: Subcellular fractionation and MS sample preparation.
Fig. 4: Comparison of different MS approaches.
Fig. 5: Classification marker protein evaluation in HeLa cell HiRIEF analysis.
Fig. 6: Classification output and MS strategy comparison.
Fig. 7: Protein co-localization and differential localization analysis.

Data availability

The MS proteomics data for the analysis of the HeLa cell line have been deposited to the ProteomeXchange Consortium via the jPOST partner repository with the dataset identifier PXD022533. The SubCellBarCode R package and manual are freely available through the Bioconductor repository (https://doi.org/10.18129/B9.bioc.SubCellBarCode). The MS proteomics data for the previous analysis of five different cell lines have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD006895. All described processed datasets can be easily accessed, explored and visualized at the SubCellBarCode portal (https://www.subcellbarcode.org). Source data are provided with this paper.

Code availability

The SubCellBarCode pipeline used in this protocol is available at https://github.com/TanerArslan/SubCellBarCode/ and can also be found in Bioconductor (https://bioconductor.org/packages/release/bioc/html/SubCellBarCode.html).

Acknowledgements

This research was supported by funding from the Swedish Foundation for Strategic Research, Swedish Cancer Society, Swedish Research Council, Swedish Childhood Cancer Foundation, The Cancer Research Funds of Radiumhemmet and Stockholm’s County Council (ALF funding). We also thank O. Berkovska for critical reading of the manuscript.

Author information

Contributions

The study was conceived by T.A., Y.P., M.V., L.M.O. and J.L. Cell culturing, subcellular fractionation and MS analysis were performed by Y.P. and G.M. Bioinformatics related to the classification of protein localization was performed by T.A. R package building was performed by T.A. Classification output analysis and evaluation of data in relation to other resources were performed by T.A. and L.M.O. The paper was written by T.A., Y.P., M.V. and L.M.O. All authors contributed to finalizing the manuscript and approved the final version.

Corresponding authors

Correspondence to Lukas M. Orre or Janne Lehtiö.

Ethics declarations

Competing interests

J.L. reports receiving honoraria for invited speaker activities from Pfizer and Roche, and institutional research support as a PI from AstraZeneca, Novartis and GE Healthcare. J.L. and L.M.O. are co-founders and shareholders of FenoMark Diagnostics AB.

Peer review

Peer review information

Nature Protocols thanks Josie Christopher, Markku Varjosalo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Orre, L. M. et al. Mol. Cell 73, 166–182.e7 (2019): https://doi.org/10.1016/j.molcel.2018.11.035

Key data used in this protocol

Orre, L. M. et al. Mol. Cell 73, 166–182.e7 (2019): https://doi.org/10.1016/j.molcel.2018.11.035

Extended data

Extended Data Fig. 1 Samples generated during cell fractionation.

a, Duplicate dishes of cells were incubated with digitonin (42 μg/ml) on a tilt rocker at 4 °C for 7 min. b, After incubation, the digitonin solution was recovered as fraction FS1. c, Excessive digitonin treatment (>10 min) will disrupt chromatin, resulting in DNA ‘threads’ in the dish upon harvesting of cells. d, Zoom in on DNA ‘threads’. e, After harvesting by scraping into low-salt buffer, the cells are transferred into a Dounce homogenizer. Mechanical disruption of cells is accomplished by repeated strokes with a tight-fitting glass pestle. One stroke equals one down-and-up movement of the pestle. The strokes should be performed below the liquid level to avoid foaming. f, Examples of FP1 pellets generated by low-speed centrifugation. g, Examples of FP2 pellets generated by medium-speed centrifugation. For some cell lines, FP2 may be difficult to see with the naked eye. h, Examples of soluble fractions (FS2) and pellets (FP3) generated by ultracentrifugation at 100,000g for 1 h. sup., supernatant.

Extended Data Fig. 2 MS data comparison of MS approaches.

a, Cumulative distribution plots showing the number of PSMs used for protein quantification for different MS approaches. Indicated in the plots is the percentage of quantifications that were based on at least three PSMs. b, Scatter plots displaying the number of PSMs used for quantification and fractionation profile correlation between replicates. c, Bar plots showing the number of proteins identified (left, 1% FDR, gene centric), the number of unique peptides (middle, 1% FDR) and the number of PSMs used for quantification (right) for the three different MS approaches used. corr., correlation.
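
The percentage reported in panel a can be illustrated with a minimal Python sketch. It is not taken from the SubCellBarCode package: the function name `percent_with_min_psms` and the example counts are hypothetical, and the only assumption is that each protein quantification comes with an integer PSM count.

```python
def percent_with_min_psms(psm_counts, min_psms=3):
    """Return the percentage of quantifications supported by >= min_psms PSMs."""
    if not psm_counts:
        raise ValueError("psm_counts must not be empty")
    supported = sum(1 for n in psm_counts if n >= min_psms)
    return 100.0 * supported / len(psm_counts)


# Hypothetical PSM counts for six protein quantifications (illustrative only).
example_counts = [1, 2, 3, 5, 8, 13]
example_percent = percent_with_min_psms(example_counts)
print(f"{example_percent:.1f}% of quantifications use at least 3 PSMs")
```

The threshold is a parameter, so the same helper could also be used to split proteins at other PSM cut-offs.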

Source data

Extended Data Fig. 3 Classification output and MS method comparison.

a, Bar plots indicating the number of compartment classifications for the three different MS approaches used. b, Box plots displaying the classification probabilities for compartment-level classifications for the three different MS methods used. c, Box plots showing the minimum number of PSMs used for quantification of proteins on the basis of neighborhood classification agreement between HiRIEF and high-pH strategies (left) or HiRIEF and long-gradient strategies (right). Wilcoxon signed-rank test (two-sided) was used to calculate P values. d, Bar plots indicating the neighborhood classification agreement between methods for proteins quantified with one to two PSMs or more than two PSMs. e, Bar plots showing the number of classifications and the classification frequency of proteins binned by quantitative range (maximum value through minimum value in fractionation profile) for different MS approaches. Quantitative data are binned into five portions: 0–0.5, 0.5–1, 1–1.5, 1.5–2 and >2. The elements of the box plots in the figure are as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; points, outliers.
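
The binning in panel e can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the quantitative range is taken to be the maximum minus the minimum value of a protein's fractionation profile, a value falling exactly on a bin edge is assigned to the lower bin, and the names `quantitative_range` and `range_bin` are hypothetical.

```python
from bisect import bisect_left

# Upper edges of the first four bins; anything above the last edge goes to ">2".
BIN_EDGES = [0.5, 1.0, 1.5, 2.0]
BIN_LABELS = ["0-0.5", "0.5-1", "1-1.5", "1.5-2", ">2"]


def quantitative_range(profile):
    """Assumed definition: maximum minus minimum value of the profile."""
    return max(profile) - min(profile)


def range_bin(profile):
    """Return the label of the bin containing the profile's quantitative range.

    bisect_left places a value that equals an edge into the lower bin
    (an assumption; the caption does not specify edge handling).
    """
    return BIN_LABELS[bisect_left(BIN_EDGES, quantitative_range(profile))]


# Hypothetical fractionation profiles (illustrative values only).
print(range_bin([-0.1, 0.2, 0.1]))  # small range, first bin
print(range_bin([-1.5, 1.0]))       # range of 2.5, last bin
```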

Source data

Extended Data Fig. 4 Comparison of classification output for different MS approaches.

a, Density plots showing the minimum number of PSMs used for quantifications of proteins. Density plots are colored on the basis of agreement between HiRIEF and high-pH MS strategies: yellow if protein classification is the same in both methods, red if it is different, blue if the protein is unclassified in one of the methods and green if unclassified in both methods. b, Scatter plot showing the maximum neighborhood classification probability score of HiRIEF and high-pH methods and the minimum number of PSMs used for quantification between HiRIEF and high-pH methods. Proteins are colored on the basis of classification agreement as described above (a). c, SubCellBarCodes for APC protein from the original five cell lines analyzed and available in the SubCellBarCode.org resource. The left figure indicates neighborhood-level classifications, and the right figure indicates compartment-level classifications. d, Scatter plots showing classification probability of proteins for HiRIEF and long-gradient MS strategies (left) and high-pH and long-gradient MS strategies (right). Proteins are colored on the basis of classification agreement: yellow indicates the same classification in both methods, red indicates a different classification, blue indicates proteins that were unclassified in one of the methods and green indicates proteins that were unclassified in both methods. class., classified; grad., gradient; Unclass., unclassified.
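
The four agreement categories and their colors, as described in panels a, b and d, can be expressed as a short Python sketch. Representing an unclassified protein as `None`, the function name `agreement_category` and the example labels are assumptions made for illustration only.

```python
def agreement_category(class_a, class_b):
    """Compare two classifications of the same protein from two MS methods.

    None stands for "unclassified" (an assumption of this sketch).
    Returns a (category, color) pair following the caption's color scheme.
    """
    if class_a is None and class_b is None:
        return ("unclassified in both", "green")
    if class_a is None or class_b is None:
        return ("unclassified in one", "blue")
    if class_a == class_b:
        return ("same", "yellow")
    return ("different", "red")


# Hypothetical classifications for one protein in two methods.
print(agreement_category("Nuclear", "Nuclear"))
print(agreement_category("Nuclear", None))
```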

Source data

Supplementary information

Supplementary Information

Supplementary Method 1. BioConductor vignette for the SubCellBarCode R package

Reporting Summary

Supplementary Table 1

LC gradient lengths and strategy used for LC-MS analysis of the individual fractions (column A) generated by HiRIEF pre-fractionation in the pH ranges 3–10 (column B) and 3.4–4.8 (column C)

Supplementary Table 2

Relative quantitative data (TMT ratios) for the different MS datasets used in the current study. Data are represented in five different sheets: combined HiRIEF 3–10/3.4–4.8, HiRIEF 3–10, HiRIEF 3.4–4.8, high pH and long gradient. Each sheet includes the following columns: column A, gene symbol–centric protein ID; columns B–L, TMT ratios for the five fractions (FS1, FS2 and FP1–3) in duplicate (A and B); and column M, minimum number of PSMs used for quantification for any of the 10 TMT channels.

Supplementary Table 3

SubCellBarCode classification output for the different MS datasets used in the current study. Data are represented in five different sheets: combined HiRIEF 3–10/3.4–4.8, HiRIEF 3–10, HiRIEF 3.4–4.8, high pH and long gradient. Each sheet includes the following columns: column A, gene symbol–centric protein ID; column B, final neighborhood classification (SVMoutput); column C, final compartment classification (SVMoutput); columns D–G, SVM-derived probabilities for the individual neighborhoods; and columns H–V, SVM-derived probabilities for the individual compartments.

Source data

Source Data Fig. 4

Statistical and numerical source data

Source Data Fig. 5

Statistical and numerical source data

Source Data Fig. 6

Statistical and numerical source data

Source Data Fig. 7

Statistical and numerical source data

Source Data Extended Data Fig. 2

Statistical and numerical source data

Source Data Extended Data Fig. 3

Statistical and numerical source data

Source Data Extended Data Fig. 4

Statistical and numerical source data

About this article

Cite this article

Arslan, T., Pan, Y., Mermelekas, G. et al. SubCellBarCode: integrated workflow for robust spatial proteomics by mass spectrometry. Nat Protoc (2022). https://doi.org/10.1038/s41596-022-00699-2

  • DOI: https://doi.org/10.1038/s41596-022-00699-2
