  • Analysis

The Cancer Surfaceome Atlas integrates genomic, functional and drug response data to identify actionable targets

Abstract

Cell-surface proteins (SPs) are a rich source of immune and targeted therapies. By systematically integrating single-cell and bulk genomics, functional studies and target actionability, we comprehensively identify and annotate genes encoding SPs (GESPs) pan-cancer. We characterize GESP expression patterns, recurrent genomic alterations, essentiality, receptor–ligand interactions and therapeutic potential. We find that mRNA expression of GESPs is cancer-type specific and positively correlates with protein expression, and that certain GESP subgroups function as common or specific essential genes for tumor cell growth. We also predict receptor–ligand interactions that are substantially deregulated in cancer and, using systems biology approaches, identify cancer-specific GESPs with therapeutic potential. We have made this resource available through the Cancer Surfaceome Atlas (http://fcgportal.org/TCSA) within the Functional Cancer Genome data portal.


Fig. 1: Definition of the human surfaceome on a genome-wide scale.
Fig. 2: Expression of GESPs across healthy normal tissues and primary tumor specimens.
Fig. 3: Identification of GESPs that are specifically expressed in cancers.
Fig. 4: Evaluation of GESP combinations for logic-gated CAR-T design.
Fig. 5: Characterization of recurrent genomic alterations of GESPs across cancers.
Fig. 6: Characterization of receptor–ligand interactions of the GESPs in cancers.
Fig. 7: Characterization of mIAMs in cancers.
Fig. 8: Evaluation of GESPs as therapeutic targets in anticancer drug development.

Data availability

The present study is based on genomic profiles generated by the TCGA project, which was supported by the NCI and the National Human Genome Research Institute (http://cancergenome.nih.gov). TCGA profiling data are publicly available through the TCGA data portal (https://tcga-data.nci.nih.gov/tcga), the Genomic Data Commons portal (GDC, https://gdc-portal.nci.nih.gov), the GDAC Firehose of the Broad Institute (http://gdac.broadinstitute.org), the UCSC Toil RNAseq Recompute Compendium (https://xenabrowser.net/datapages/?hub=https://toil.xenahubs.net:443), the TCGA Multi-Center Mutation Calling in Multiple Cancers (MC3) project (https://doi.org/10.7303/syn7214402) and the TumorFusions data portal (http://tumorfusions.org/). Proteomics profiles were generated by the NCI’s CPTAC (https://proteomics.cancer.gov/programs/cptac). The CPTAC profiling data are publicly available through the CPTAC data portal (https://cptac-data-portal.georgetown.edu). CRISPR–Cas9 screening profiles in human cancer cell lines are publicly available through the DepMap portal (https://depmap.org/portal) and the Score projects (https://doi.org/10.6084/m9.figshare.c.5289226.v1). The scRNA-seq data are available through http://blueprint.lambrechtslab.org (breast invasive carcinoma, colon adenocarcinoma and ovarian serous cystadenocarcinoma), http://ureca-singlecell.kr (bladder urothelial carcinoma), https://bigd.big.ac.cn/bioproject/browse/PRJCA001063 (pancreatic adenocarcinoma), https://dna-discovery.stanford.edu/research/datasets (follicular lymphoma and stomach adenocarcinoma), https://science.sciencemag.org/highwire/filestream/713964/field_highwire_adjunct_files/6/aat1699_DataS1.gz.zip (kidney renal clear cell carcinoma) and the Gene Expression Omnibus (accession nos. GSE125449, GSE131907, GSE131928 and GSE139829 for cholangiocarcinoma and liver hepatocellular carcinoma, lung adenocarcinoma, glioblastoma multiforme and uveal melanoma, respectively).
The data generated by the present study are publicly available through the FCG data portal (http://fcgportal.org/fcgtcsa). All other data supporting the findings of the present study are available from the corresponding author on reasonable request. Source data are provided with this paper.

Code availability

The code for analysis of TCSA is available at https://github.com/fcgportal/TCSA.

Acknowledgements

The present study was supported, in whole or in part, by grants from the Pennsylvania Department of Health, the Harry Fields Professorship and the Abramson Cancer Center. L.Z. was supported by the Basser Center for BRCA and US National Institutes of Health (NIH) grants (nos. R01CA142776, R01CA190415, R01CA225929, R01CA262070, P50CA083638 and P50CA174523). R.H.V. was supported by NIH grants (nos. P01CA210944 and R01CA229803). X.H. was supported by the Ovarian Cancer Research Alliance. X.H. and Y.Z. were supported by the Foundation for Women’s Cancer. Support of the core facilities was provided by an NIH Cancer Center support grant (no. P30CA016520) to the Abramson Cancer Center.

Author information

Contributions

Z.H., J.Y., X.H., R.H.V. and L.Z. conceived and designed the research. Z.H. and J.Y. performed the computational analysis and statistical computations. M.L., J.J., Y.Z., T.Z., M.X., F.Y., J.L.T., K.T.M., O.T. and H.M.C. performed raw data collection and dataset integration, and contributed to general discussion on genomics, immunology, cancer pathology and drug discovery. Z.H., J.Y., X.H., R.H.V. and L.Z. wrote the paper.

Corresponding authors

Correspondence to Xiaowen Hu, Robert H. Vonderheide or Lin Zhang.

Ethics declarations

Competing interests

L.Z. and X.H. report having received research funding from AstraZeneca, Bristol-Myers Squibb/Celgene and Prelude Therapeutics. R.H.V. is an inventor on a licensed patent relating to cancer cellular immunotherapy and receives royalties from Children’s Hospital Boston for a licensed research-only monoclonal antibody. O.T. and H.M.C. are employees of AstraZeneca. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Cancer thanks Francesco Iorio and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Tissue specificity of GESPs across normal tissues.

a, Percentages of genes which were detectable (median FPKM value >1) by RNA-seq analysis in 0–6, 7–23, and 24–30 tissue types. Red: GESPs; and gray: non-GESPs. b, The percentages of genes detectable in 0–6, 7–23, and 24–30 tissue types, stratified by subcellular location of gene products. c, Bar plot (left) and bubble plot (right) show enrichment of tissue type-specific genes in the corresponding subgroups based on subcellular location of gene products. P-values were calculated by two-sided Fisher’s exact test. Purple, enriched; green, depleted. The size of the bubble: absolute value of log2(odds ratio).
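The quantities named in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy inputs and the rule used for the two-sided P value (summing every table that is no more probable than the observed one) are assumptions made for illustration. It counts the tissues in which a gene passes the median FPKM > 1 cutoff, assigns the 0–6, 7–23 or 24–30 bin, and computes a two-sided Fisher's exact P value and log2(odds ratio) for a 2 × 2 table.

```python
from math import comb, log2


def count_detectable_tissues(median_fpkm_by_tissue, threshold=1.0):
    """Count tissues in which the gene's median FPKM is strictly above the cutoff."""
    return sum(1 for value in median_fpkm_by_tissue.values() if value > threshold)


def tissue_bin(n_detectable):
    """Map a tissue count (0-30) to the caption's bins: 0-6, 7-23 or 24-30."""
    if not 0 <= n_detectable <= 30:
        raise ValueError("expected a count between 0 and 30 tissue types")
    if n_detectable <= 6:
        return "0-6"
    if n_detectable <= 23:
        return "7-23"
    return "24-30"


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of all tables
    (with the same margins) that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


def log2_odds_ratio(a, b, c, d):
    """log2 of the sample odds ratio (a*d)/(b*c); undefined if any cell is zero."""
    if 0 in (a, b, c, d):
        raise ValueError("odds ratio is undefined or infinite with a zero cell")
    return log2((a * d) / (b * c))


# Hypothetical gene with median FPKM values in three tissues.
example_gene = {"liver": 2.0, "lung": 0.5, "brain": 1.0}
n_tissues = count_detectable_tissues(example_gene)
```

In a bubble plot like the one described, the absolute value of `log2_odds_ratio` would set the bubble size and its sign would distinguish enrichment from depletion.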


Extended Data Fig. 2 GESPs are specifically expressed in distinct cell populations in the tumor microenvironment.

a, Bubble plots show expression levels and percentages of cells expressing ERBB3, EGFR, MET, PDGFRB, KDR, CTLA4, CSF1R or IL2RB in each cell population across 13 cancer types. Bubble size: percentage of positive cells; intensity of color: expression level. b, Violin plots show gene expression levels of ERBB3, EGFR, MET, PDGFRB, KDR, CTLA4, CSF1R and IL2RB in each cell population at the single-cell level. Each dot represents the expression level in one cell.

Extended Data Fig. 3 Distribution of correlation coefficient between protein and mRNA expression levels.

The empirical null distribution for the correlation of mRNA and protein, generated by permuting samples, is shown for comparison (all p-values < 2.2 × 10^−16, two-sided Wilcoxon rank-sum test).
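A permutation null of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the caption does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the function names, the number of permutations and the seed are hypothetical. Shuffling the protein values across samples breaks the sample pairing while leaving both marginal distributions unchanged.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def permutation_null(mrna, protein, n_perm=1000, seed=0):
    """Correlations obtained after shuffling sample labels of the protein values.

    Each permutation pairs every mRNA value with a protein value from a
    randomly chosen sample, which gives an empirical null distribution.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(protein)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(pearson(mrna, shuffled))
    return null


# Toy data for one gene across six samples (hypothetical values).
mrna_levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
protein_levels = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
observed_r = pearson(mrna_levels, protein_levels)
null_r = permutation_null(mrna_levels, protein_levels, n_perm=50, seed=1)
```

Pooling the observed per-gene correlations and the permuted ones gives two distributions that can then be compared, for example with the rank-sum test named in the caption.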

Extended Data Fig. 4 Characteristics of caGESPs stratified by specificity score.

Top panel: pie charts showing the percentages of caGESPs in each tier, stratified by specificity score. All caGESPs have a specificity score ≥ 3. Bottom panel: dot plot showing the relative contribution of each algorithm to the identification of caGESPs, stratified by specificity score. Size, weighted vote; color intensity, fractional vote.


Extended Data Fig. 5 Somatic copy number alterations of the GESPs across cancers.

a, The workflow of the somatic copy number alteration (SCNA) analysis. Four criteria were used to identify the putative cancer-causing GESPs driven by SCNAs in each cancer type. b and c, Bubble plots show the SCNA G-scores, which consider both the amplitudes of the aberrations and the frequencies of their occurrence across samples, of the putative cancer-causing GESPs driven by SCNAs in each cancer type. b, Copy number gain; c, copy number loss. The size of the bubble: G-score; red: gain; blue: loss. PubTator scores, which represent the number of publications for a given gene and were retrieved from the PubTator database, are shown next to the G-score plot. Green: 1–150 (understudied genes); red: >150. Target development levels of each gene, which were retrieved from the PHAROS database, are shown on the left. Red: Tclin; blue: Tchem; green: Tbio; gray: Tdark. Genes are ordered by overall G-score (from largest to smallest). The top 100 GESPs with the highest overall G-scores are shown.

Extended Data Fig. 6 Somatic mutations of the GESPs across cancers.

a, The workflow of the recurrent somatic mutation analysis. Five complementary methods were integrated to identify the putative cancer-causing GESPs driven by mutations in each cancer type. b, Bubble plot showing the mutation frequencies and mutation indexes of the putative cancer-causing GESPs driven by somatic mutations in each cancer type. The size of the bubble: mutation frequency; intensity of color: mutation index. c, Bubble plot showing the frequencies of hotspot mutations of GESPs in each cancer type. The size of the bubble: hotspot mutation frequency. The locations of hotspot mutated regions were retrieved from cancerhotspots.org. PubTator scores, which represent the number of publications for a given gene and were retrieved from the PubTator database, are shown next to the bubble plot. Green: 1–150 (understudied genes); red: >150. Target development levels of each gene, which were retrieved from the PHAROS database, are shown on the left. Red: Tclin; blue: Tchem; green: Tbio; gray: Tdark. Genes are ordered by overall M-score (from largest to smallest). The top 100 GESPs with the highest overall M-scores are shown.

Extended Data Fig. 7 Transcript fusions of the GESPs across cancers.

a, Summary of the GESP transcript fusion events across cancers. The size of the bubble: number of GESP transcript fusion events across 33 cancer types. PubTator scores, which represent the number of publications for a given gene and were retrieved from the PubTator database, are shown next to the bubble plot. Green: 1–150 (understudied genes); red: >150. Target development levels of each gene, which were retrieved from the PHAROS database, are shown on the left. Red: Tclin; blue: Tchem; green: Tbio; gray: Tdark. Genes are ordered by the overall number of fusion events (from largest to smallest). The top 86 GESPs with the highest number of fusion events (≥12) are shown. b, Summary of the GESP transcript fusion pairs across cancers. The size of the bubble: number of GESP transcript fusion pairs across 33 cancer types. Fusion pairs are ordered by the overall number of recurrent pairs (from largest to smallest). The top 86 GESP transcript fusion pairs are shown.

Extended Data Fig. 8 Characterization of dependencies of the GESPs in cancer cell growth.

a, Proportional doughnut graph showing the frequency of common essential and strongly selective genes among GESPs (outer layer) and non-GESPs (inner layer). b, Summary of enrichment for essential genes (common essential and strongly selective) in the corresponding subgroups based on subcellular locations. Bar plot (left) and bubble plot (right) show the odds ratios on a log scale for each subgroup. Purple and green bars indicate that essential genes are enriched and depleted in the corresponding subgroups, respectively. The color intensity of the bars and bubbles indicates the enrichment significance calculated by two-sided Fisher’s exact test. c, Summary of the association between mRNA expression levels and dependencies for essential GESPs. The x-axis represents the effect size of each gene. Positive effect size values represent higher dependency in cells expressing higher levels of mRNA. The y-axis represents the negative logarithm (base 10) of the p-values from the Bioconductor limma package. The Benjamini–Hochberg (BH) method was used to adjust the p-values. Each circle corresponds to a GESP, with size proportional to overall G-score (gain). Red circles represent GESPs whose dependencies are significantly and positively correlated with their mRNA expression levels. GESPs that are recurrently amplified in tumors and whose dependencies are significantly and positively correlated with both copy number and mRNA expression levels are outlined with a black border. d, Association between mRNA expression levels and dependencies for EGFR (left) and ERBB2 (right). Red dots represent cells with gene copy number amplification. Density plots of gene expression and gene dependencies are stratified by gene amplification status.
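The Benjamini–Hochberg adjustment and the −log10 transform mentioned for panel c can be sketched in a few lines of Python. This is a generic illustration of the standard BH step-up procedure, not the code used in the study; the function names and the example P values are hypothetical.

```python
from math import log10


def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input.

    Each P value is multiplied by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downwards keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def neg_log10(p):
    """Negative base-10 logarithm, as used for a volcano-plot y axis."""
    if not 0 < p <= 1:
        raise ValueError("P value must be in (0, 1]")
    return -log10(p)


# Hypothetical raw P values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
```

With these inputs, the two smallest P values both adjust to 0.02 and the two largest to 0.04, which shows how the running minimum produces ties.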


Extended Data Fig. 9 Characterization of membrane-bound immunological accessory molecules (mIAMs) in cancers.

a, Mosaic plot showing the classification of mIAMs based on their expression patterns across cancer cell lines from non-hematological malignancies. b, Circos plot showing the number of mIAM-associated interactions between cell types across 13 cancer types. Paired cell types with significant cell–cell interactions identified by CellPhoneDB are connected by lines. The width of each line indicates the normalized number of mIAM-associated interactions between the two cell types.


Supplementary information

Reporting Summary

Supplementary Table 1

Supplementary Tables 1–40.

Source data

Source Data Figs. 1–8

Statistical source data.

Source Data Extended Data Figs. 1, 4, 8 and 9

Statistical source data.

Cite this article

Hu, Z., Yuan, J., Long, M. et al. The Cancer Surfaceome Atlas integrates genomic, functional and drug response data to identify actionable targets. Nat Cancer 2, 1406–1422 (2021). https://doi.org/10.1038/s43018-021-00282-w
