A multi-scale map of cell structure fusing protein images and interactions

Qin, Yue; Huttlin, Edward L.; Winsnes, Casper F.; Gosztyla, Maya L.; Wacheul, Ludivine; Kelly, Marcus R.; Blue, Steven M.; Zheng, Fan; Chen, Michael; Schaffer, Leah V.; Licon, Katherine; Bäckström, Anna; Vaites, Laura Pontano; Lee, John J.; Ouyang, Wei; Liu, Sophie N.; Zhang, Tian; Silva, Erica; Park, Jisoo; Pitea, Adriana; Kreisberg, Jason F.; Gygi, Steven P.; Ma, Jianzhu; Harper, J. Wade; Yeo, Gene W.; Lafontaine, Denis L. J.; Lundberg, Emma; Ideker, Trey

doi:10.1038/s41586-021-04115-9

Published: 24 November 2021

Nature volume 600, pages 536–542 (2021)

Abstract

The cell is a multi-scale structure with modular organization across at least four orders of magnitude¹. Two central approaches for mapping this structure, protein fluorescent imaging and protein biophysical association, each generate extensive datasets, but of distinct qualities and resolutions that are typically treated separately²,³. Here we integrate immunofluorescence images in the Human Protein Atlas⁴ with affinity purifications in BioPlex⁵ to create a unified hierarchical map of human cell architecture. Integration is achieved by configuring each approach as a general measure of protein distance, then calibrating the two measures using machine learning. The map, known as the multi-scale integrated cell (MuSIC 1.0), resolves 69 subcellular systems, of which approximately half are to our knowledge undocumented. Accordingly, we perform 134 additional affinity purifications and validate subunit associations for the majority of systems. The map reveals a pre-ribosomal RNA processing assembly and accessory factors, which we show govern rRNA maturation, and functional roles for SRRM1 and FAM120C in chromatin and RPS3A in splicing. By integration across scales, MuSIC increases the resolution of imaging while giving protein interactions a spatial dimension, paving the way to incorporate diverse types of data in proteome-wide cell maps.

**Fig. 1: Overview of data fusion strategy.**

**Fig. 2: The multi-scale integrated cell.**

**Fig. 3: MuSIC captures subcellular components and diameters.**

**Fig. 4: Different data informs different scales of information.**

**Fig. 5: Exploration of MuSIC using physical and functional assays.**

Data availability

A web portal is available at http://nrnb.org/music with links to all major resources used for this study. These include the MuSIC map (https://doi.org/10.18119/N9188W); the immunofluorescence (HPA) and AP–MS data (BioPlex 2.0) on which the map is based; and data for the AP–MS pull-down experiments performed as follow-up. The new AP–MS data have also been included as part of the larger compendium of protein interactions in the next version of the BioPlex resource (BioPlex 3.0²⁹). AP–MS data, including filtered and unfiltered interaction lists as well as raw mass spectrometry data, are also available at http://bioplex.hms.harvard.edu. The image data and associated metadata can also be found in the HPA database (https://www.proteinatlas.org). The Gene Expression Omnibus (GEO) accession number for eCLIP data generated in this study is GSE171553. Source data are provided with this paper.

Code availability

The MuSIC pipeline is available at https://github.com/idekerlab/MuSIC along with a detailed step-by-step guide to building a MuSIC map.

References

1. Harold, F. M. Molecules into cells: specifying spatial architecture. Microbiol. Mol. Biol. Rev. 69, 544–564 (2005).
2. Mori, H. & Cardiff, R. D. Methods of immunohistochemistry and immunofluorescence: converting invisible to visible. In The Tumor Microenvironment, Methods in Molecular Biology Vol. 1458 (eds Ursini-Siegel, J. & Beauchemin, N.) 1–12 (Humana Press, 2016).
3. Aebersold, R. & Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 537, 347–355 (2016).
4. Thul, P. J. et al. A subcellular map of the human proteome. Science 356, eaal3321 (2017).
5. Huttlin, E. L. et al. Architecture of the human interactome defines protein communities and disease networks. Nature 545, 505–509 (2017).
6. Schaffer, L. V. & Ideker, T. Mapping the multiscale structure of biological systems. Cell Syst. 12, 622–635 (2021).
7. Ouyang, W. et al. Analysis of the Human Protein Atlas Image Classification competition. Nat. Methods 16, 1254–1261 (2019).
8. Grover, A. & Leskovec, J. node2vec: scalable feature learning for networks. In KDD ’16: Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 855–864 (2016).
9. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning Vol. 1 (MIT Press, 2016).
10. Fortunato, S. & Hric, D. Community detection in networks: a user guide. Phys. Rep. 659, 1–44 (2016).
11. Go, C. D. et al. A proximity-dependent biotinylation map of a human cell. Nature 595, 120–124 (2021).
12. Meyers, R. M. et al. Computational correction of copy number effect improves specificity of CRISPR–Cas9 essentiality screens in cancer cells. Nat. Genet. 49, 1779–1784 (2017).
13. Deckert, J. et al. Protein composition and electron microscopy structure of affinity-purified human spliceosomal B complexes isolated under physiological conditions. Mol. Cell. Biol. 26, 5528–5543 (2006).
14. Charenton, C., Wilkinson, M. E. & Nagai, K. Mechanism of 5′ splice site transfer for human spliceosome activation. Science 364, 362–367 (2019).
15. Yoshikatsu, Y. et al. NVL2, a nucleolar AAA-ATPase, is associated with the nuclear exosome and is involved in pre-rRNA processing. Biochem. Biophys. Res. Commun. 464, 780–786 (2015).
16. Chaudhuri, S. et al. Human ribosomal protein L13a is dispensable for canonical ribosome function but indispensable for efficient rRNA methylation. RNA 13, 2224–2237 (2007).
17. Tafforeau, L. et al. The complexity of human ribosome biogenesis revealed by systematic nucleolar screening of pre-rRNA processing factors. Mol. Cell 51, 539–551 (2013).
18. Eppens, N. A. et al. Deletions in the S1 domain of Rrp5p cause processing at a novel site in ITS1 of yeast pre-rRNA that depends on Rex4p. Nucleic Acids Res. 30, 4222–4231 (2002).
19. De Silva, D., Tu, Y.-T., Amunts, A., Fontanesi, F. & Barrientos, A. Mitochondrial ribosome assembly in health and disease. Cell Cycle 14, 2226–2250 (2015).
20. Blencowe, B. J. et al. The SRm160/300 splicing coactivator subunits. RNA 6, 111–120 (2000).
21. The UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2019).
22. Pavan Kumar, P. et al. Phosphorylation of SATB1, a global gene regulator, acts as a molecular switch regulating its transcriptional activity in vivo. Mol. Cell 22, 231–243 (2006).
23. Pomeranz Krummel, D. A., Oubridge, C., Leung, A. K. W., Li, J. & Nagai, K. Crystal structure of human spliceosomal U1 snRNP at 5.5 Å resolution. Nature 458, 475–480 (2009).
24. Fleckner, J., Zhang, M., Valcárcel, J. & Green, M. R. U2AF65 recruits a novel human DEAD box protein required for the U2 snRNP-branchpoint interaction. Genes Dev. 11, 1864–1872 (1997).
25. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).
26. Van Nostrand, E. L. et al. Robust, cost-effective profiling of RNA binding protein targets with single-end enhanced crosslinking and immunoprecipitation (seCLIP). In mRNA Processing, Methods in Molecular Biology Vol. 1648 (ed. Shi, Y.) 177–200 (Humana Press, 2017).
27. Stryer, L. Fluorescence energy transfer as a spectroscopic ruler. Annu. Rev. Biochem. 47, 819–846 (1978).
28. Wang, T. et al. Gene essentiality profiling reveals gene networks and synthetic lethal interactions with oncogenic Ras. Cell 168, 890–903 (2017).
29. Huttlin, E. L. et al. Dual proteome-scale networks reveal cell-specific remodeling of the human interactome. Cell 184, 3022–3040 (2021).
30. Williams, S. G. & Hall, K. B. Human U2B″ protein binding to snRNA stemloops. Biophys. Chem. 159, 82–89 (2011).
31. Huang, G., Liu, Z., van der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. Preprint at https://arxiv.org/abs/1608.06993 (2016).
32. Nusinow, D. P. et al. Quantitative proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387–402 (2020).

Acknowledgements

We thank C. Ng, A. Palmer, Q. Zhang, Y. Quan, members of the laboratories of T.I. and E.L., the Human Protein Atlas and J. Swedlow for discussion and comments; M. Dow for helping us to improve the MuSIC GitHub repository and test the MuSIC pipeline; and the Cell Profiling facility and C. Stadler at the Science for Life Laboratory for help with in situ fractionation. This work was supported by the National Institutes of Health (NIH) under grants U54 CA209891, U01 MH115747, P41 GM103504 and R01 HG009979 to T.I., F99 CA264422 to Y.Q., U24 HG006673 to E.L.H., S.P.G. and J.W.H., U41 HG009889 and R01s HL137223 and HG004659 to G.W.Y. and R50 CA243885 to J.F.K.; by a gift from Google Ventures to J.W.H. and S.P.G.; by the Erling-Persson family foundation, Knut and Alice Wallenberg Foundation (2016.0204) and the Swedish Research Council (2017-05327) to E.L.; and by the Belgian Fonds de la Recherche Scientifique (F.R.S./FNRS), the Université Libre de Bruxelles (ULB), the European Joint Programme on Rare Diseases (‘RiboEurope’ and ‘DBAcure’), the Région Wallonne (SPW EER) (‘RIBOcancer’), the Internationale Brachet Stiftung and the Epitran COST action (CA16120) to D.L.J.L.

Author information

Authors and Affiliations

Department of Medicine, University of California San Diego, La Jolla, CA, USA
Yue Qin, Marcus R. Kelly, Fan Zheng, Michael Chen, Leah V. Schaffer, Katherine Licon, John J. Lee, Sophie N. Liu, Erica Silva, Jisoo Park, Adriana Pitea, Jason F. Kreisberg & Trey Ideker
Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
Yue Qin, Gene W. Yeo & Trey Ideker
Department of Cell Biology, Harvard Medical School, Boston, MA, USA
Edward L. Huttlin, Laura Pontano Vaites, Tian Zhang, Steven P. Gygi & J. Wade Harper
Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden
Casper F. Winsnes, Anna Bäckström, Wei Ouyang & Emma Lundberg
Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
Maya L. Gosztyla, Steven M. Blue & Gene W. Yeo
Stem Cell Program, University of California San Diego, La Jolla, CA, USA
Maya L. Gosztyla, Steven M. Blue & Gene W. Yeo
Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
Maya L. Gosztyla, Steven M. Blue, Gene W. Yeo & Trey Ideker
RNA Molecular Biology, Fonds de la Recherche Scientifique (F.R.S./FNRS), Université Libre de Bruxelles (ULB), Charleroi-Gosselies, Belgium
Ludivine Wacheul & Denis L. J. Lafontaine
Institute for Artificial Intelligence, Peking University, Beijing, China
Jianzhu Ma
Department of Genetics, Stanford University, Stanford, CA, USA
Emma Lundberg
Chan Zuckerberg Biohub, San Francisco, San Francisco, CA, USA
Emma Lundberg
Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
Trey Ideker
Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
Trey Ideker

Contributions

Y.Q., E.L. and T.I. designed the study and developed the conceptual ideas. C.F.W. and W.O. generated image embeddings. Y.Q. and J.M. designed the data integration approach. Y.Q. and F.Z. designed the community detection approach. Y.Q., E.L.H., C.F.W., F.Z., L.V.S., W.O., J.P., A.P., J.F.K., J.M., J.W.H., E.L. and T.I. developed ideas for data analyses. Y.Q. implemented all computational methods and analyses. Y.Q., C.F.W., L.V.S., W.O., J.P. and T.I. organized the GitHub repository and wrote the step-by-step guide. Y.Q., E.L.H., C.F.W., M.R.K., L.P.V., E.S., J.F.K., S.P.G., J.W.H., G.W.Y., D.L.J.L., E.L. and T.I. designed validation experiments. E.L.H., L.P.V., T.Z., J.W.H. and S.P.G. generated and analysed AP–MS data and provided FLAG–HA-tagged clones. S.M.B. and G.W.Y. generated and analysed RIP–qPCR data. L.W. and D.L.J.L. generated and analysed northern blot data. C.F.W., A.B. and E.L. generated and analysed in situ fractionation data. M.L.G. and G.W.Y. generated and analysed eCLIP data. Y.Q., M.C., K.L. and J.J.L. performed the rest of the experiments. Y.Q., S.N.L. and T.I. designed the web portal page. Y.Q., E.L. and T.I. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Emma Lundberg or Trey Ideker.

Ethics declarations

Competing interests

T.I. is a co-founder of Data4Cure, is on the Scientific Advisory Board and has an equity interest. T.I. is on the Scientific Advisory Board of Ideaya BioSciences and has an equity interest. G.W.Y. is a co-founder, a member of the Board of Directors, on the Scientific Advisory Board, an equity holder and a paid consultant for Locanabio and Eclipse BioInnovations. G.W.Y. is a visiting professor at the National University of Singapore. The terms of these arrangements have been reviewed and approved by the University of California San Diego in accordance with its conflict-of-interest policies. E.L. is on the Scientific Advisory Boards of Cartography Biosciences, Nautilus Biotechnology and Interline Therapeutics, and has an equity interest in all of these. J.W.H. is a co-founder of Caraway Therapeutics, is on the Scientific Advisory Board and has an equity interest. J.W.H. is Founding Scientific Advisor for Interline Therapeutics.

Additional information

Peer review information Nature thanks Jason Swedlow and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Characterization of image data used in this study.

a, Histogram showing distribution in number of antibodies per protein over 661 proteins included in MuSIC. b, Histogram showing distribution in antibody quality scores over antibodies used in this study. c, Immunofluorescence images for alternative antibodies (columns) targeting the same protein (rows). Colours represent immunostained protein (green), cytoskeleton (red), or nucleus (blue). Images show high reproducibility for different antibodies against the same protein. d, Comparison of localizations for proteins in MuSIC (HEK293 cells, red) versus all proteins assayed by HPA in any cell line (grey). Localizations as defined by the HPA project⁴.

Extended Data Fig. 2 Embedding immunofluorescence images and AP–MS data.

a, Embedding immunofluorescence (IF) images using DenseNet. The 1024-dimension feature vector for each IF image was extracted from a DenseNet-121³¹ model trained to classify the IF image into one or several of 28 pre-defined protein localization classes from HPA. b, Two-dimensional visualization (UMAP, n_neighbours = 5) for the 1,451 image embeddings associated with the 661 proteins in MuSIC. c, Ability of different image embedding methods (coloured curves) to generate image-image similarities (cosine similarity) in agreement with protein-protein interactions in BioPlex 2.0. d, Node2vec⁸ workflow. The feature vector generated by node2vec captures the pattern of interaction neighbourhood for the respective node in input network. e, Embedding AP–MS data using node2vec. The input network to node2vec was constructed by treating each protein as a node and assigning edges between protein pairs that were identified as physically interacting in the AP–MS data. The two-dimensional visualization (UMAP, n_neighbours = 5) for AP–MS embeddings associated with 661 proteins in MuSIC is shown at right. f, Network showing all proteins (grey) that physically interact with SNRPC and SNRPB2 (blue) in BioPlex 2.0. SNRPC and SNRPB2 do not physically interact, but the cosine similarity of their embedded features is 0.93 due to shared interaction neighbourhood. In many cases of two proteins with high node2vec similarity but without direct interaction in AP–MS data, we found that neither protein had yet been tagged as bait for an affinity purification experiment. In these cases, the node2vec embedding suggests gaps in existing AP–MS data. g, Ability of different AP–MS embedding methods to generate protein-protein similarities (cosine similarity) in agreement with protein pairwise similarities computed from HPA images.
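
The SNRPC and SNRPB2 example in panel f rests on a simple point: two proteins can have nearly identical interaction neighbourhoods without sharing an edge themselves. The sketch below illustrates this with cosine similarity over binary adjacency rows, which is only a crude stand-in for node2vec embeddings. The toy network and the labels `A`, `B` and `P1` to `P4` are hypothetical and are not BioPlex data.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def adjacency_rows(edges):
    """Map each node to a 0/1 vector marking its direct neighbours."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {n: i for i, n in enumerate(nodes)}
    rows = {n: [0] * len(nodes) for n in nodes}
    for a, b in edges:
        rows[a][index[b]] = 1
        rows[b][index[a]] = 1
    return rows

# Hypothetical toy network: A and B never interact directly,
# but both bind P1, P2 and P3; B additionally binds P4.
edges = [("A", "P1"), ("A", "P2"), ("A", "P3"),
         ("B", "P1"), ("B", "P2"), ("B", "P3"), ("B", "P4")]
rows = adjacency_rows(edges)
sim_ab = cosine_similarity(rows["A"], rows["B"])
print(f"cosine(A, B) = {sim_ab:.2f}")
```

A learned embedding such as node2vec captures more than direct neighbours (it samples random walks), so this example should be read only as intuition for why a high similarity can coexist with a missing edge.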

Extended Data Fig. 3 Fusing protein distances from immunofluorescence and affinity purification.

a, b, Protein pairs ranked by similarity in AP–MS embedding enrich for the most similar protein pairs in IF (a), and vice versa (b). c, Calibrating physical diameter, D, of subcellular components against the number of proteins, C, assigned to the corresponding Gene Ontology (GO) terms. d, Supervised model (random forest) estimates physical proximity (nm) of all pairs of proteins from their IF and AP–MS embeddings. e, Performance of model in recovering protein-protein distances in GO in five-fold cross validation (red, Pearson’s r). Equivalent calculation for random feature sets (grey). Statistics calculated using two-sided paired t-test. Data are presented as mean values +/- standard deviation.
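
Panel c describes calibrating the physical diameter D of a component against the number of proteins C annotated to the matching GO term. One plausible form for such a calibration is a straight line in log-log space. The sketch below assumes that form and uses made-up (C, D) pairs, so neither the functional form nor the numbers should be read as the values used in the paper.

```python
import math

def fit_loglog(counts, diameters_nm):
    """Least-squares fit of log10(D) = slope * log10(C) + intercept."""
    xs = [math.log10(c) for c in counts]
    ys = [math.log10(d) for d in diameters_nm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_diameter_nm(count, slope, intercept):
    """Invert the log-log line to get a diameter estimate in nm."""
    return 10 ** (slope * math.log10(count) + intercept)

# Hypothetical calibration points that follow D = 2 * sqrt(C) exactly.
counts = [4, 16, 64, 256]
diameters_nm = [4.0, 8.0, 16.0, 32.0]

slope, intercept = fit_loglog(counts, diameters_nm)
estimate = predict_diameter_nm(100, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, D(100)={estimate:.1f} nm")
```

A calibration like this supplies a distance label for every protein pair that shares a GO term, which is the kind of target a supervised model such as the random forest in panel d can then be trained to predict.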

Extended Data Fig. 4 Selection of parameters for community detection.

a, Using multi-scale community detection, protein systems of increasing sizes are discovered as the threshold for protein-protein distance is progressively increased. b, CliXO community detection has four parameters (depth α, y-axis; breadth β, x-axis; minimum modularity m and modularity significance z, red circle backslash) that affect the sensitivity with which communities are identified and thus the size of the hierarchy. c, d, Dot plots in which each dot is a community hierarchy generated with a particular set of parameters. The selection for MuSIC is highlighted in red. This selection was among several that were optimal based on enrichment for protein-protein interactions in Human Cell Map (c) and co-essentialities from DepMap (d). Examples of other parameter sets are shown in blue. e, Map from Fig. 2 with system colour showing enrichment for co-essentialities among protein pairs that are specific to that system. Enrichment of each system is assessed empirically, using 1,000 randomized hierarchies, followed by Benjamini–Hochberg multiple test correction to obtain FDR (orange gradient).
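
The idea in panel a, that larger systems emerge as the distance threshold is relaxed, can be illustrated with a much simpler procedure than CliXO: link every protein pair whose distance is at or below the threshold and take the connected components. This single-linkage sketch is a deliberate substitute and not the CliXO algorithm. The protein labels and distances are hypothetical.

```python
def communities_at_threshold(distances, threshold):
    """Connected components of the graph linking pairs with distance <= threshold."""
    nodes = sorted({n for pair in distances for n in pair})
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Report only groups with at least two members.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical pairwise distances in nm.
distances = {
    ("P1", "P2"): 5,
    ("P3", "P4"): 8,
    ("P2", "P3"): 40,
    ("P4", "P5"): 300,
}

for threshold in (10, 50, 500):
    print(threshold, communities_at_threshold(distances, threshold))
```

Sweeping the threshold yields nested groups: two small pairs at 10 nm, one merged group of four at 50 nm and a single group of five at 500 nm. Stacking these levels gives a hierarchy, which is the structure the parameters in panel b tune in the real method.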

Extended Data Fig. 5 Supporting analyses for PRRPA.

a, Distributions of protein-protein distance z-scores among the seven proteins in the PRRPA system for IF (top, red) or AP–MS (bottom, blue) modalities, calibrated to all such distances, respectively (grey). Statistics calculated using one-sided Mann–Whitney U test. b, Specific recovery of new AP–MS interactions within PRRPA is shown (dark blue bar), in comparison to interactions between proteins in PRRPA and other proteins organized under the same parent systems (“Ribosome” and “Ribosome biogenesis assembly”, light blue bar), or between proteins in PRRPA and those organized elsewhere in MuSIC (grey bar). c, Mature 28S/18S rRNA ratio under siRNAs targeting each PRRPA protein (green) versus scrambled siRNA (grey), n = 3 biological replicates. FDR from two-sided t-test with Benjamini–Hochberg correction. Data are presented as mean values +/- standard deviation. d–i, Western blot analysis (d, e, Simple western assay; f–i, SDS–PAGE) of target protein abundance after treating HEK293T cells with respective siRNA for 72 h (Supplementary Tables 6, 7). The siRNAs highlighted in red were selected to assess the perturbation of mature rRNA ratio (28S/18S rRNA) when knocking down target protein, with protein knockdown efficiency confirmed using western blot in three additional biological replicates. For source data, see Supplementary Fig. 1 (gel; d–i) and Supplementary Fig. 2 (total RNA profiles; c).
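
Panel a compares distances within a system against the background of all distances. The two ingredients, z-scoring against the background and the Mann–Whitney U statistic, can be sketched as follows. The distance values are hypothetical, and the step from U to a one-sided P value is left out to keep the example short.

```python
import statistics

def z_scores(values, background):
    """Standardize values against the mean and s.d. of a background set."""
    mu = statistics.mean(background)
    sigma = statistics.pstdev(background)
    return [(v - mu) / sigma for v in values]

def mann_whitney_u(sample, reference):
    """U statistic for `sample`: count of (s, r) pairs with s > r; ties add 0.5."""
    u = 0.0
    for s in sample:
        for r in reference:
            if s > r:
                u += 1.0
            elif s == r:
                u += 0.5
    return u

# Hypothetical distances (arbitrary units).
background = [100, 200, 300, 400, 500]   # all protein pairs
system = [50, 120, 150]                  # pairs inside one system

system_z = z_scores(system, background)
u_stat = mann_whitney_u(system, background)
max_u = len(system) * len(background)
print([round(z, 2) for z in system_z], u_stat, max_u)
```

A U value far below half of `max_u` indicates that within-system distances tend to be shorter than background distances, which is the direction a one-sided test of this kind is looking for.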

Extended Data Fig. 6 Supporting analyses for ribosomal systems.

a, Categorization of proteins in “Ribosome biogenesis community” by whether they have been previously identified in human ribosome biogenesis. Excludes PRRPA proteins described in Fig. 5b–d. b, Structure of human pre-rRNA and probes used for northern blot. In eukaryotes, 3 out of 4 mature rRNAs (18S, 5.8S, and 28S rRNAs) are produced from a single long polycistronic precursor (47S) synthesized by RNA polymerase I. The mature rRNAs are interspersed with the 5′ and 3′ external transcribed spacers (ETS) and internal transcribed spacer (ITS) 1 and 2. The probes used in the northern blot (5′-ETS, ITS1, and ITS2) are indicated and colour-coded. c, Total RNA extracted from the indicated cell line, which was transfected with a DsiRNA specific to the target protein for 72 h and analysed by northern blotting with probes specific to the 5′-ETS, ITS1, and ITS2 sequences (Supplementary Table 8). As controls, cells were either untreated, transfected with a scrambled silencer, or transfected with a silencer targeting UTP18 (positive control involved in small ribosomal subunit biogenesis). Heat map colour shows the percentage of each pre-rRNA species with respect to the scramble control. For gel source data, see Supplementary Fig. 1. d, For protein baits in new AP–MS experiments (x axis), fraction of interacting preys that fall within the Ribosome biogenesis community (blue bars) versus elsewhere (grey bars). Only new AP–MS interactions are considered for this analysis. RNPS1 does not belong to Ribosome biogenesis community and serves as a negative control. e, IF images showing similar cytoplasmic staining for proteins in “Mito-cyto ribosomal cluster.” Cytoplasmic staining is dim for MRPS9, MRPS14 and MRPS31 compared to their predominant mitochondrial locations. Colours represent immunostained protein (green), cytoskeleton (red) and nucleus (blue). 
f, g, Corresponding distributions of protein-protein distance z-scores for IF (f, red) or AP–MS (g, blue), calibrated to all such distances, respectively (grey). Statistics calculated using one-sided Mann–Whitney U test. h, Two-dimensional projection of proteins in Mito-cyto ribosomal cluster, as in Fig. 5f. Proteins coloured according to known affiliations to cytoplasmic ribosome or mitochondrial ribosome. i, Validated AP–MS interactions in Mito-cyto ribosomal cluster. Note that only one out of seven proteins was previously tagged as bait in BioPlex 2.0 (light blue node), thus most physical associations (dark blue edges) among protein pairs were newly identified in this study.

Extended Data Fig. 7 Supporting analyses for chromatin regulation and splicing systems.

a, IF images showing similar nucleoplasm and nuclear speckles signals among proteins in the “Chromatin regulation complex.” Colours represent immunostained protein (green) and cytoskeleton (red). b, Distributions of pairwise protein distance z-scores among the proteins in the Chromatin regulation complex for IF (top, red) or AP–MS (bottom, blue) modalities, calibrated to all such distances, respectively (grey). Statistics calculated using one-sided Mann–Whitney U test. c, Immunofluorescent proteins (rows) imaged in HEK293 cells, untreated (left) or treated (right) with in situ fractionation to remove soluble cytoplasmic and loosely held nuclear proteins. Chromatin-binding proteins remain after treatment. Blue, nucleus; other colours as in a. For image source data, see Supplementary Fig. 3. d, IF images showing similar nucleoplasm signals among proteins in “RNA splicing complex 3.” e, Similar display for RNA splicing complex 3 as in b. f, Comparison of 500 top differentially expressed mRNAs (absolute fold change) resulting from shRNA knockdown of each of five genes (see Supplementary Table 9 for file accessions). Bar chart shows number of differential mRNAs shared by different gene groups indicated by black dots beneath each bar. One-sided one-sample t-test. g, Comparison among the top 10 pathways (Gene Ontology Biological Process) returned from Gene Set Enrichment Analysis using the top 500 differentially expressed transcripts. Bar chart shows number of enriched pathways shared by different gene groups indicated by black dots beneath each bar. One-sided one-sample t-test. h, eCLIP workflow. RBP, RNA-binding protein. NGS, next generation sequencing.

Extended Data Fig. 8 Supporting analyses for Discussion.

a, b, Examples of proteins with strong AP–MS protein interactions that have very different IF localization patterns. Colours represent immunostained protein (green) and cytoskeleton (red). c, Degree of co-essentiality for gene pairs within PRRPA (teal bar) shown in comparison to remaining pairs of genes assigned to the more general system that contains it, “Ribosome biogenesis community” (green bar), as well as all other gene pairs in MuSIC (grey bar). d, Similar analysis as in (c) for “RNA splicing complex 3.” Parent systems are “RNA processing complex 1” and “RNA splicing complex family.” e, Protein co-abundance for MuSIC systems, calculated from the median Pearson correlation of pairwise protein abundance over 375 diverse cell lines³². The plot shows all systems with fewer than 20 proteins and co-abundance measurements for >50% of protein pairs. Significance is assessed empirically (one-sided), using 1,000 randomized MuSIC hierarchies, followed by Benjamini–Hochberg multiple test correction to obtain FDR (colour of bar). Protein co-abundance for a system provides evidence for its presence in cell types beyond HEK293.
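
The co-abundance score in panel e combines three steps: a median of pairwise Pearson correlations, an empirical P value against randomized hierarchies, and Benjamini–Hochberg correction. A minimal sketch of each step is given below. The abundance profiles, the null scores and the add-one form of the empirical P value are assumptions made for illustration and are not taken from the paper.

```python
import itertools
import math
import statistics

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def co_abundance(abundance, members):
    """Median Pearson correlation over all protein pairs in a system."""
    pairs = itertools.combinations(members, 2)
    return statistics.median(
        pearson(abundance[a], abundance[b]) for a, b in pairs
    )

def empirical_p(observed, null_scores):
    """One-sided empirical P value with an add-one correction (assumed form)."""
    hits = sum(1 for s in null_scores if s >= observed)
    return (1 + hits) / (1 + len(null_scores))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical abundance profiles across four cell lines.
abundance = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [1.0, 3.0, 2.0, 5.0],
}
score = co_abundance(abundance, ["P1", "P2", "P3"])
p_value = empirical_p(score, null_scores=[0.1, 0.2, 0.9, 0.3])
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(round(score, 3), p_value, [round(q, 3) for q in fdr])
```

In a real analysis the null scores would come from recomputing `co_abundance` on each randomized hierarchy (1,000 in the caption), with one P value per system passed to the correction step.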

Supplementary information

Supplementary Figures

This file consists of three supplementary figures and provides the source gel data (Supplementary Figure 1), total RNA profiles (Supplementary Figure 2) and source in situ fractionation data (Supplementary Figure 3).

Reporting Summary

Supplementary Methods

This file contains Supplementary Methods and Supplementary References.

Peer Review File

Supplementary Table 1

MuSIC proteins and associated data.

Supplementary Table 2

Literature collection of subcellular components used for calibrating physical diameter (related to Extended Data Fig. 3c).

Supplementary Table 3

MuSIC systems and associated data.

Supplementary Table 4

Literature collection of subcellular components used for validating MuSIC estimated diameter (related to Fig. 3b).

Supplementary Table 5

866 reproducible and significant (IDR cut-off of 0.01, Fisher’s Exact test P ≤ 0.001, fold enrichment ≥8) eCLIP peaks of RPS3A.

Supplementary Table 6

Sequences of siRNA and DsiRNA used in this study.

Supplementary Table 7

Antibodies used in this study.

Supplementary Table 8

Sequences of northern blot probes used for pre-rRNA analysis (related to Fig. 5e and Extended Data Fig. 6b, c).

Supplementary Table 9

ENCODE file accessions used for RNA-seq analysis (related to Extended Data Fig. 7f, g).

Source data

Source Data Fig. 5

About this article

Cite this article

Qin, Y., Huttlin, E.L., Winsnes, C.F. et al. A multi-scale map of cell structure fusing protein images and interactions. Nature 600, 536–542 (2021). https://doi.org/10.1038/s41586-021-04115-9

Received: 18 June 2020
Accepted: 08 October 2021
Published: 24 November 2021
Issue Date: 16 December 2021
DOI: https://doi.org/10.1038/s41586-021-04115-9
