  • Resource
  • Published:

A protein expression atlas on tissue samples and cell lines from cancer patients provides insights into tumor heterogeneity and dependencies

Abstract

The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE) are foundational resources in cancer research, providing extensive molecular and phenotypic data. However, large-scale proteomic data across various cancer types for these cohorts remain limited. Here, we expand upon our previous work to generate high-quality protein expression data for approximately 8,000 TCGA patient samples and around 900 CCLE cell line samples, covering 447 clinically relevant proteins, using reverse-phase protein arrays. These protein expression profiles offer profound insights into intertumor heterogeneity and cancer dependency and serve as sensitive functional readouts for somatic alterations. We develop a systematic protein-centered strategy for identifying synthetic lethality pairs and experimentally validate an interaction between protein kinase A subunit α and epidermal growth factor receptor. We also identify metastasis-related protein markers with clinical relevance. This dataset represents a valuable resource for advancing our understanding of cancer mechanisms, discovering protein biomarkers and developing innovative therapeutic strategies.

Fig. 1: Overview of the upgraded RPPA resource of TCGA and CCLE samples.
Fig. 2: Global patterns of RPPA protein expression in different TCGA cancer types.
Fig. 3: Global patterns of RPPA protein expression in different CCLE cancer lineages.
Fig. 4: Effects of BRAF mutations on RPPA-based protein signaling.
Fig. 5: Systematic identification of synthetic lethality based on RPPA500 data of CCLE and TCGA samples.
Fig. 6: Characterization of tumor metastasis potential based on RPPA protein expression.

Data availability

The RPPA dataset generated in this study is accessible through the TCPA data portal (https://tcpaportal.org). This portal includes one subplatform for TCGA patient tumor samples and another for CCLE cell lines. The ‘dataset summary’ module provides detailed information about the number of samples for each type of cancer or cell line lineage (related to Fig. 1). In the TCGA patient subplatform, several analysis modules are available, including protein–protein correlation analysis, differential analysis and survival analysis (related to Fig. 2). Data related to the CCLE cell lines are hosted at the MD Anderson Cell Lines Project, a subplatform under TCPA. The analyses include protein–protein correlation analysis, protein–drug correlation analysis, protein–mutation correlation analysis and protein–dependency correlation analysis (related to Figs. 3–5). Furthermore, the comprehensive annotation for each antibody is available in the ‘my protein’ module on both subplatforms. Each entry in this module corresponds to a protein marker, showing relevant gene information, as well as the validation status of the antibody and its origin, source, catalog number and RRID.

We obtained CCLE-related data from DepMap (https://depmap.org/portal/), including the genomic (mutations, copy number and DNA methylation), transcriptomic (RNA-seq and microRNA), MS, drug sensitivity, gene dependency and metabolomics data. Additional drug sensitivity data were downloaded from GDSC (https://www.cancerrxgene.org), PRISM (https://depmap.org/repurposing/) and GDSC drug combinations (https://gdsc-combinations.depmap.sanger.ac.uk). The metastatic potential data were downloaded from MetMap (https://depmap.org/metmap/). For TCGA samples, we downloaded molecular, tumor purity and clinical data from TCGA PanCanAtlas (https://gdc.cancer.gov/about-data/publications/pancanatlas). The annotations of hallmark gene sets were downloaded from Gene Set Enrichment Analysis (http://www.gsea-msigdb.org).

All other data supporting the findings of this study are available from the corresponding author on reasonable request. Source data are provided with this paper.

Code availability

All the software tools used for analysis in this study are accessible in public repositories. We used R to process the data and perform the computational analysis. SuperCurve can be found at https://bioinformatics.mdanderson.org/public-software/supercurve/. Cytoscape is available at https://cytoscape.org. ComplexHeatmap (ref. 69) and ConsensusClusterPlus (ref. 67) are R packages available on Bioconductor. We used BioRender (https://www.biorender.com) to generate the schematic diagrams and ggplot2 (ref. 77) to generate the data analysis plots. No custom code was generated in the course of this analysis.

References

  1. Ding, L. et al. Perspective on oncogenic processes at the end of the beginning of cancer genomics. Cell 173, 305–320 (2018).

  2. Hutter, C. & Zenklusen, J. C. The Cancer Genome Atlas: creating lasting value beyond its data. Cell 173, 283–285 (2018).

  3. Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).

  4. Li, H. et al. The landscape of cancer cell line metabolism. Nat. Med. 25, 850–860 (2019).

  5. Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).

  6. Behan, F. M. et al. Prioritization of cancer therapeutic targets using CRISPR–Cas9 screens. Nature 568, 511–516 (2019).

  7. Dempster, J. M. et al. Agreement between two large pan-cancer CRISPR–Cas9 gene dependency data sets. Nat. Commun. 10, 5817 (2019).

  8. Garnett, M. J. et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483, 570–575 (2012).

  9. Meyers, R. M. et al. Computational correction of copy number effect improves specificity of CRISPR–Cas9 essentiality screens in cancer cells. Nat. Genet. 49, 1779–1784 (2017).

  10. Rodriguez, H., Zenklusen, J. C., Staudt, L. M., Doroshow, J. H. & Lowy, D. R. The next horizon in precision oncology: proteogenomics to inform cancer diagnosis and treatment. Cell 184, 1661–1670 (2021).

  11. Akbani, R. et al. Realizing the promise of reverse phase protein arrays for clinical, translational, and basic research: a workshop report: the RPPA (Reverse Phase Protein Array) society. Mol. Cell. Proteomics 13, 1625–1643 (2014).

  12. Akbani, R. et al. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun. 5, 3887 (2014).

  13. Li, J. et al. Characterization of human cancer cell lines by reverse-phase protein arrays. Cancer Cell 31, 225–239 (2017).

  14. Zhao, W. et al. Large-scale characterization of drug responses of clinically relevant proteins in cancer cell lines. Cancer Cell 38, 829–843 (2020).

  15. Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385 (2018).

  16. Berger, A. C. et al. A comprehensive pan-cancer molecular study of gynecologic and breast cancers. Cancer Cell 33, 690–705 (2018).

  17. Fang, Y. et al. Sequential therapy with PARP and WEE1 inhibitors minimizes toxicity while maintaining efficacy. Cancer Cell 35, 851–867 (2019).

  18. Zhang, Y. et al. A pan-cancer proteogenomic atlas of PI3K/AKT/mTOR pathway alterations. Cancer Cell 31, 820–832 (2017).

  19. Li, J. et al. Explore, visualize, and analyze functional cancer proteomic data using The Cancer Proteome Atlas. Cancer Res. 77, e51–e54 (2017).

  20. Li, J. et al. TCPA: a resource for cancer functional proteomics data. Nat. Methods 10, 1046–1047 (2013).

  21. Siwak, D. R., Li, J., Akbani, R., Liang, H. & Lu, Y. Analytical platforms 3: processing samples via the RPPA pipeline to generate large-scale data for clinical studies. Adv. Exp. Med. Biol. 1188, 113–147 (2019).

  22. Nusinow, D. P. et al. Quantitative proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387–402 (2020).

  23. Basu, A. et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 154, 1151–1161 (2013).

  24. Corsello, S. M. et al. Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat. Cancer 1, 235–248 (2020).

  25. Iorio, F. et al. A landscape of pharmacogenomic interactions in cancer. Cell 166, 740–754 (2016).

  26. Jin, X. et al. A metastasis map of human cancer cell lines. Nature 588, 331–336 (2020).

  27. Chen, M. M. et al. TCPA v3.0: an integrative platform to explore the pan-cancer analysis of functional proteomic data. Mol. Cell. Proteomics 18, S15–S25 (2019).

  28. Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830 (2018).

  29. Frejno, M. et al. Proteome activity landscapes of tumor cell lines determine drug responses. Nat. Commun. 11, 3639 (2020).

  30. Hoadley, K. A. et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304 (2018).

  31. Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).

  32. Tsherniak, A. et al. Defining a cancer dependency map. Cell 170, 564–576 (2017).

  33. Dempster, J. M. et al. Chronos: a cell population dynamics model of CRISPR experiments that improves inference of gene fitness effects. Genome Biol. 22, 343 (2021).

  34. Pacini, C. et al. Integrated cross-study datasets of genetic dependencies in cancer. Nat. Commun. 12, 1661 (2021).

  35. Quintas-Cardama, A. & Cortes, J. Molecular biology of BCR–ABL1-positive chronic myeloid leukemia. Blood 113, 1619–1630 (2009).

  36. Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196 (2010).

  37. Chen, H. et al. Comprehensive assessment of computational algorithms in predicting cancer driver mutations. Genome Biol. 21, 43 (2020).

  38. Ng, P. K. et al. Systematic functional annotation of somatic mutations in cancer. Cancer Cell 33, 450–462 (2018).

  39. Davies, H. et al. Mutations of the BRAF gene in human cancer. Nature 417, 949–954 (2002).

  40. Menzer, C. et al. Targeted therapy in advanced melanoma with rare BRAF mutations. J. Clin. Oncol. 37, 3142–3151 (2019).

  41. Yaeger, R. & Corcoran, R. B. Targeting alterations in the Raf–MEK pathway. Cancer Discov. 9, 329–341 (2019).

  42. Lavoie, H., Gagnon, J. & Therrien, M. ERK signalling: a master regulator of cell behaviour, life and fate. Nat. Rev. Mol. Cell Biol. 21, 607–632 (2020).

  43. Yao, Z. et al. BRAF mutants evade ERK-dependent feedback by different mechanisms that determine their sensitivity to pharmacologic inhibition. Cancer Cell 28, 370–383 (2015).

  44. Negrao, M. V. et al. Molecular landscape of BRAF-mutant NSCLC reveals an association between clonality and driver mutations and identifies targetable non-V600 driver mutations. J. Thorac. Oncol. 15, 1611–1623 (2020).

  45. Chen, S. H. et al. Oncogenic BRAF deletions that function as homodimers and are sensitive to inhibition by Raf dimer inhibitor LY3009120. Cancer Discov. 6, 300–315 (2016).

  46. Eisenhardt, A. E. et al. Functional characterization of a BRAF insertion mutant associated with pilocytic astrocytoma. Int. J. Cancer 129, 2297–2303 (2011).

  47. O’Neil, N. J., Bailey, M. L. & Hieter, P. Synthetic lethality and cancer. Nat. Rev. Genet. 18, 613–623 (2017).

  48. Lee, J. S. et al. Synthetic lethality-mediated precision oncology via the tumor transcriptome. Cell 184, 2487–2502 (2021).

  49. Jaaks, P. et al. Effective drug combinations in breast, colon and pancreatic cancer cells. Nature 603, 166–173 (2022).

  50. Abourehab, M. A. S., Alqahtani, A. M., Youssif, B. G. M. & Gouda, A. M. Globally approved EGFR inhibitors: insights into their syntheses, target kinases, biological activities, receptor interactions, and metabolism. Molecules 26, 6677 (2021).

  51. Sakamoto, K. M. & Frank, D. A. CREB in the pathophysiology of cancer: implications for targeting transcription factors for cancer therapy. Clin. Cancer Res. 15, 2583–2587 (2009).

  52. Riccio, A., Ahn, S., Davenport, C. M., Blendy, J. A. & Ginty, D. D. Mediation by a CREB family transcription factor of NGF-dependent survival of sympathetic neurons. Science 286, 2358–2361 (1999).

  53. Srinivasan, S. et al. Tobacco carcinogen-induced production of GM-CSF activates CREB to promote pancreatic cancer. Cancer Res. 78, 6146–6158 (2018).

  54. Qin, Y. et al. Interfering MSN–NONO complex-activated CREB signaling serves as a therapeutic strategy for triple-negative breast cancer. Sci. Adv. 6, eaaw9960 (2020).

  55. Fares, J., Fares, M. Y., Khachfe, H. H., Salhab, H. A. & Fares, Y. Molecular principles of metastasis: a hallmark of cancer revisited. Signal Transduct. Target. Ther. 5, 28 (2020).

  56. Khan, I. & Steeg, P. S. Metastasis suppressors: functional pathways. Lab. Invest. 98, 198–210 (2017).

  57. Nguyen, D. T. et al. Pharos: collating protein information to shed light on the druggable genome. Nucleic Acids Res. 45, D995–D1002 (2017).

  58. Dou, Y. et al. Proteogenomic characterization of endometrial carcinoma. Cell 180, 729–748 (2020).

  59. Gillette, M. A. et al. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell 182, 200–225 (2020).

  60. Vasaikar, S. et al. Proteogenomic analysis of human colon cancer reveals new therapeutic opportunities. Cell 177, 1035–1049 (2019).

  61. Tibes, R. et al. Reverse phase protein array: validation of a novel proteomic technology and utility for analysis of primary leukemia specimens and hematopoietic stem cells. Mol. Cancer Ther. 5, 2512–2521 (2006).

  62. Hennessy, B. T. et al. A technical assessment of the utility of reverse phase protein arrays for the study of the functional proteome in non-microdissected human breast cancers. Clin. Proteomics 6, 129–151 (2010).

  63. Hu, J. et al. Non-parametric quantification of protein lysate arrays. Bioinformatics 23, 1986–1994 (2007).

  64. Neeley, E. S., Baggerly, K. A. & Kornblau, S. M. Surface adjustment of reverse phase protein arrays using positive control spots. Cancer Inform. 11, 77–86 (2012).

  65. Ju, Z. et al. Development of a robust classifier for quality control of reverse-phase protein arrays. Bioinformatics 31, 912–918 (2015).

  66. Gonzalez-Angulo, A. M. et al. Functional proteomics can define prognosis and predict pathologic complete response in patients with breast cancer. Clin. Proteomics 8, 11 (2011).

  67. Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).

  68. Gu, Z., Gu, L., Eils, R., Schlesner, M. & Brors, B. circlize implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812 (2014).

  69. Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).

  70. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  71. Sanchez-Vega, F. et al. Oncogenic signaling pathways in The Cancer Genome Atlas. Cell 173, 321–337 (2018).

  72. Dogruluk, T. et al. Identification of variant-specific functions of PIK3CA by rapid phenotyping of rare mutations. Cancer Res. 75, 5341–5354 (2015).

  73. Tsang, Y. H. et al. Functional annotation of rare gene aberration drivers of pancreatic cancer. Nat. Commun. 7, 10500 (2016).

  74. Cheung, L. W. et al. Naturally occurring neomorphic PIK3R1 mutations activate the MAPK pathway, dictating therapeutic response to MAPK pathway inhibitors. Cancer Cell 26, 479–494 (2014).

  75. Liang, H. et al. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer. Genome Res. 22, 2120–2129 (2012).

  76. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).

  77. Wickham, H. ggplot2: Elegant Graphics for Data Analysis 2nd edn (Springer, 2016).

Acknowledgements

This study was supported by the NIH (U01CA217842, U01CA281902 and U01CA253472 to H.L. and G.B.M., U24CA264128 to H.L., G.B.M. and R.A., R01CA251150 and P50CA281701 to H.L., P50CA221703 to H.L. and M.A.D., U24CA210950, U24CA210949 and U24CA264006 to R.A. and G.B.M., R50CA221675 to Y.L., P50CA270907 to J.L. and the Cancer Center Support Grant P30CA016672), a kind gift from the Sheldon and Miriam Adelson Medical Research Foundation, Susan G. Komen (SAC110052), the Ovarian Cancer Research Foundation (545152) and the Breast Cancer Research Foundation (BCRF-18-110) to G.B.M., a Department of Defense Congressionally Directed Medical Research Program award (W81XWH-16-1-0237) to R.A., grants from the Cancer Prevention and Research Institute of Texas (RP210042 and RP160015) to S.V.K. and the University Cancer Foundation through the Institutional Research Grant program at the University of Texas MD Anderson Cancer Center to J.L.

Author information

Contributions

R.A., G.B.M. and H.L. conceptualized the project. J.L., W.L., Z.J., S.V.K., P.K.-S.N., H.C., M.A.D., Y.L., R.A., G.B.M. and H.L. contributed to the data analysis and result discussion. K.M., H.K., Z.Z., P.K.-S.N. and Y.L. contributed to the experiments. J.L. and H.L. wrote the manuscript with input from other authors. H.L. supervised the project.

Corresponding authors

Correspondence to Rehan Akbani, Gordon B. Mills or Han Liang.

Ethics declarations

Competing interests

M.A.D. has been a consultant to Roche/Genentech, Array, Pfizer, Novartis, BMS, GSK, Sanofi-Aventis, Vaccinex, Apexigen, Eisai, Iovance and ABM Therapeutics and he has been the PI of research grants to MD Anderson from Roche/Genentech, GSK, Sanofi-Aventis, Merck, Myriad, Oncothyreon and ABM Therapeutics. R.A. is a bioinformatics consultant for the University of Houston. G.B.M. is a scientific advisory board member or consultant for AstraZeneca, Chrysalis Biotechnology, GSK, ImmunoMET, Ionis, Lilly, PDX Pharmaceuticals, Signalchem Lifesciences, Symphogen, Tarveda, Turbine and Zentalis Pharmaceuticals, has stocks, options or financial considerations with Catena Pharmaceuticals, ImmunoMet, SignalChem and Tarveda and has licensed the HRD assay to Myriad Genetics and DSP patents to Nanostring. H.L. is a shareholder and advisor for Precision Scientific. The other authors declare no competing interests.

Peer review

Peer review information

Nature Cancer thanks Bing Zhang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Quality control of RPPA data.

(a) Summary of antibody selection and validation process. (b) Overview of RPPA data normalization, with rigorous QC parameters or controls at each step. (c) Comparison of protein markers in Phase I and Phase I & II. (d) Boxplots showing differential correlations of RPPA-based IFN-gamma response pathway score and that reported by literature in Phase I and Phase I & II. N = 32 represents the number of cancer types. (e) Comparison of correlations between RPPA and MS/RNA-seq in Phase I (N = 204) and Phase II (N = 243). (c-e) P-values are based on paired Wilcoxon tests. N represents the number of protein markers. (f) The distributions of the lineage-specific expression correlations of RPPA-based proteins with RNA-seq-based mRNA (top panel), MS-based total proteins (middle panel), and correlations between MS-based total proteins and RNA-seq-based mRNA (bottom panel). The distributions are shown for total proteins and PTM proteins, respectively. (g) A scatter plot showing the correlation between the sample sizes and the mean correlations across different lineages. N = 49 represents the number of mean correlations. (h) The distribution of expression correlations between RPPA-based and MS-based phosphorylated proteins in NCI60 cell lines. (i) A scatter plot showing a representative example of the phosphoprotein, HSP27_pS82, between the RPPA and the MS data. N = 38 represents the number of cell line samples. (c-e) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. (g, i) Shaded areas denote the 95% confidence intervals. The p-values are based on Spearman’s correlation coefficient test.
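The box plot convention stated in this legend (median line, first and third quartiles as the box, whiskers limited to 1.5× the interquartile range) can be illustrated with a minimal sketch. It is not code from the study, which reports using R and ggplot2; the function name `box_stats`, the example values and the choice of linearly interpolated quartiles are assumptions made for illustration only.

```python
from statistics import median, quantiles


def box_stats(values):
    """Summarize values the way a Tukey-style box plot draws them.

    Assumption: quartiles use linear interpolation (the "inclusive"
    method); other quartile definitions give slightly different boxes.
    """
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical values; 30 lies beyond the upper fence and is an outlier.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

In this sketch the whiskers stop at the last observation within the fence, so a point such as 30 is reported separately instead of stretching the whisker.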

Extended Data Fig. 2 Pan-cancer analysis of TCGA samples.

(a) Forest plots of hazard ratios for cluster RPPA_K7 and clinical variables. The p-values are based on a multivariate Cox proportional hazards model. The center point of each horizontal bar represents the estimated hazard ratio. (b) Volcano plots showing differentially expressed protein markers between the corresponding cluster pairs identified from the patient survival analysis. The p-values are based on the Wilcoxon test. N represents the number of significant positive or negative protein markers (KIRC: N = 131 and 136 for significantly positive and negative protein markers, respectively; CESC: N = 143 and 192 for significantly positive and negative protein markers, respectively). (c, d) Significantly up- and downregulated pathways, EMT (c) and IFN-α (d), identified by pathway analysis for K7 (N = 16) vs. K10 (N = 237) in KIRC and K7 (N = 60) vs. K1 (N = 100) in CESC. The p-values are based on the Wilcoxon test. N represents the number of patient samples in each group. (e, f) Associations of RPPA clusters with driver mutations (e) and copy number alterations (f). Co-occurrence and mutual exclusivity are shown in different colors. The p-values are based on the Chi-squared test. Significant hits with adjustments for cancer types are shaded. N represents the number of significant hits. (g, h) Boxplots showing differential patterns of protein-mRNA coupling across TCGA patient cohorts by copy number alteration of cis-cancer driver genes (copy number amplification: N = 59; and copy number deletion: N = 24). The p-values are based on the one-sided paired Wilcoxon test. N represents the number of protein-alteration pairs. (c, d, g, h) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.

Extended Data Fig. 3 Pan-cancer analysis of CCLE samples.

(a, b) Sample distributions of different cell lineages (a) and cancer types (b) in RPPA-based clusters. (c, d) Box plots of MELANA (c) and SGK1 (d) in skin cancer (N = 49) and other cell lines (N = 829). (e) Box plot of BCL2 expression in luminal (N = 11) and other subtypes (N = 37) of breast cancer. (f) Box plot of expression of the proteins involved in the cell cycle in fibroblasts (N = 17) and other cell lines (N = 861). (c-f) N represents the number of cell line samples. (g) Box plots showing the comparison of mRNA-based |log2(FC)| vs. protein-based |log2(FC)| between lineages for the lineage-specific markers. N = 2,312 represents the number of protein-lineage pairs. (c-g) The p-values are based on the two-sided Wilcoxon test. (h) Box plots showing differential fold changes of TP53 mRNA and protein expression between TP53 mutant and wild-type TCGA tumor samples. The p-value is based on the paired Wilcoxon test. N = 30 represents the number of cancer types. (i) Effect of mutations in different P53 domains on P53 protein expression. The p-values are based on Wilcoxon test. (j) Box plots showing differential TP53 mRNA (left) and P53 protein (right) expression in TP53 wild-type samples (N = 159) and those harboring mutations in P53 DNA binding domain (DBD domain; N = 267). (k) Box plots showing differential RPPA expression of P53 between TP53 nonsense mutant (N = 40) and wild-type samples (N = 159). (l) Box plots showing differential gene dependency of TP53 between TP53 mutant (N = 364) and wild-type samples (N = 159). (i-l) N represents the number of cell line samples in each group. (m) A bar plot showing the correlation of gene dependency with PKAA protein and mRNA expression in different cell line lineages. Significant correlations (FDR < 0.1) are marked by an asterisk. (n) A bar plot summarizing the number of lineages in which each protein marker is significantly associated with their gene dependency but not with mRNA (FDR < 0.1). N represents the number of protein markers (phase I: N = 21; phase II: N = 25). (c-l) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.
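Panels (m) and (n) use an FDR < 0.1 threshold. The legend does not name the adjustment procedure, so the following sketch assumes the widely used Benjamini-Hochberg method; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR in the legend is controlled with this procedure;
    the source does not state which method was used.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical correlation-test p-values for four lineages.
p = [0.001, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
significant = [value < 0.1 for value in q]
```

With these made-up inputs, the first three tests pass the FDR < 0.1 cutoff and the fourth does not.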

Extended Data Fig. 4 Differential effects of BRAF mutations on MEK1/2 mRNA and protein expression.

(a) A scatter plot showing the correlation between MEK1/2_pS217S221 RPPA and BRAF gene dependency in melanoma cell lines (N = 36). Shaded areas denote the 95% confidence intervals. (b-c) Box plots showing differential expression of MEK1 mRNA (b), MEK2 mRNA (c) based on the functional effects of BRAF mutations characterized by the cell viability assays. (a-c) N represents the number of cell line samples. (d) Box plots showing the differential expression of MEK1/2_pS217S221 RPPA between the patient samples with an activating BRAF mutation (N = 119) and those with wild-type BRAF (N = 60). (b-d) The p-values are based on Wilcoxon test. N represents the number of patient samples. (e) A heatmap showing detailed information on the BRAF mutations, their functional effects, and the corresponding MEK1/2_pS217S221 RPPA expression pattern. (b-d) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.

Extended Data Fig. 5 In vitro validation of PKAA-EGFR synthetic lethal interaction.

(a) A Kaplan-Meier plot showing distinct survival probabilities of bladder cancer patients with both low PKAA and EGFR protein levels. The p-value is based on a log-rank test. Shaded areas denote the 95% confidence intervals. N represents the number of patients in each group (both low: N = 77; and others: N = 257). (b) A forest plot of hazard ratios for PKAA protein and clinical variables. The p-values are based on a multivariate Cox proportional hazards model. The center point of each horizontal bar represents the estimated hazard ratio. N represents the number of patients. (c) Box plots showing the differential gene dependency of PKAA between samples with EGFR deletion (N = 125) and amplification (N = 463). The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. N represents the number of cell line samples. (d) Enrichment of EGFR inhibitors in drugs resistant to high PKAA levels. The p-value is based on Fisher’s exact test. (e, f) Relative mRNA level of PKAA in PKAA-KD A549 or H226 cells. N = 2 independent replicates were examined for each condition. (g-l) Drug response assays at 72 h for A549 (g-i) or H226 (j-l) PKAA-KD and control cells treated with three EGFR inhibitors, Afatinib, Erlotinib, and Osimertinib (DMSO and 9 drug concentrations). N = 3 independent replicates were examined for each treatment and perturbation. Data are shown as mean ± SEM. The p-values are based on ANOVA.
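The enrichment test in panel (d) can be illustrated with a short sketch of Fisher's exact test on a 2×2 table. The legend does not say whether the test was one-sided or two-sided; the one-sided over-representation form is assumed here, and the function name and the counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value for over-representation.

    2x2 table layout (all counts hypothetical in the example below):
        a = in the drug set and in the class of interest
        b = in the drug set, not in the class
        c = outside the drug set, in the class
        d = outside the drug set, not in the class
    """
    set_size = a + b
    class_size = a + c
    total = a + b + c + d
    denominator = comb(total, set_size)
    # Sum hypergeometric probabilities for overlaps at least as large as a.
    upper = min(set_size, class_size)
    numerator = sum(
        comb(class_size, k) * comb(total - class_size, set_size - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 3 of 4 drugs in the set belong to the class,
# against 1 of 4 drugs outside the set.
p_value = fisher_enrichment_p(3, 1, 1, 3)
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for larger tables, at the cost of speed that is unlikely to matter for drug-level counts.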

Source data

Extended Data Fig. 6 Evaluation of metastasis markers in cancer cell lines and patient samples.

(a) Box plots showing the log2(FC) between metastatic and primary cell lines of anti- (N = 37) and pro-metastasis (N = 35) marker RPPA expression. The p-value is based on the Wilcoxon test. The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. N represents the number of protein markers in each group. (b, c) Relative mRNA level of CDK9 in CDK9-KD and control MDA-MB-231 or A549 cells. N = 3 independent replicates were examined for each perturbation. Data are shown as mean ± SEM. (d, e) ROC curves showing the predictive power of GCLM expression in ACC patients (N = 44) (d) and CHK1 in SARC patients (N = 223) (e) between metastatic and non-metastatic primary tumor samples. N represents the number of patients. (f) A pie chart showing the distribution of drug development levels for all the identified pro-metastasis protein markers. The annotation data was obtained from the Pharos database. N represents the number of protein markers in each group (Tclin: N = 6; Tchem: N = 13; and Tbio: N = 15).

Source data

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary Table 1: Information on TCGA samples profiled in this study. Supplementary Table 2: Information on CCLE samples profiled in this study. Supplementary Table 3: Information on RPPA500 protein markers and corresponding antibodies. Supplementary Table 4: Information on hallmark gene sets related protein markers. Supplementary Table 5: Predicted synthetic lethal protein pairs.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, J., Liu, W., Mojumdar, K. et al. A protein expression atlas on tissue samples and cell lines from cancer patients provides insights into tumor heterogeneity and dependencies. Nat Cancer (2024). https://doi.org/10.1038/s43018-024-00817-x

  • DOI: https://doi.org/10.1038/s43018-024-00817-x
