Abstract
The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE) are foundational resources in cancer research, providing extensive molecular and phenotypic data. However, large-scale proteomic data across various cancer types for these cohorts remain limited. Here, we expand upon our previous work to generate high-quality protein expression data for approximately 8,000 TCGA patient samples and around 900 CCLE cell line samples, covering 447 clinically relevant proteins, using reverse-phase protein arrays. These protein expression profiles offer profound insights into intertumor heterogeneity and cancer dependency and serve as sensitive functional readouts for somatic alterations. We develop a systematic protein-centered strategy for identifying synthetic lethality pairs and experimentally validate an interaction between protein kinase A subunit α and epidermal growth factor receptor. We also identify metastasis-related protein markers with clinical relevance. This dataset represents a valuable resource for advancing our understanding of cancer mechanisms, discovering protein biomarkers and developing innovative therapeutic strategies.
Data availability
The RPPA dataset generated in this study is accessible through the TCPA data portal (https://tcpaportal.org). This portal includes one subplatform for TCGA patient tumor samples and another for CCLE cell lines. The ‘dataset summary’ module provides detailed information about the number of samples for each type of cancer or cell line lineage (related to Fig. 1). In the TCGA patient subplatform, several analysis modules are available, including protein–protein correlation analysis, differential analysis and survival analysis (related to Fig. 2). Data related to the CCLE cell lines are hosted at the MD Anderson Cell Lines Project, a subplatform under TCPA. The analyses include protein–protein correlation analysis, protein–drug correlation analysis, protein–mutation correlation analysis and protein–dependency correlation analysis (related to Figs. 3–5). Furthermore, the comprehensive annotation for each antibody is available in the ‘my protein’ module on both subplatforms. Each entry in this module corresponds to a protein marker and shows the relevant gene information, as well as the validation status of the antibody and its origin, source, catalog number and RRID.
We obtained CCLE-related data from DepMap (https://depmap.org/portal/), including the genomic (mutations, copy number and DNA methylation), transcriptomic (RNA-seq and microRNA), MS, drug sensitivity, gene dependency and metabolomics data. Additional drug sensitivity data were downloaded from GDSC (https://www.cancerrxgene.org), PRISM (https://depmap.org/repurposing/) and GDSC drug combinations (https://gdsc-combinations.depmap.sanger.ac.uk). The metastatic potential data were downloaded from MetMap (https://depmap.org/metmap/). For TCGA samples, we downloaded molecular, tumor purity and clinical data from TCGA PanCanAtlas (https://gdc.cancer.gov/about-data/publications/pancanatlas). The annotations of hallmark gene sets were downloaded from Gene Set Enrichment Analysis (http://www.gsea-msigdb.org).
All other data supporting the findings of this study are available from the corresponding author on reasonable request. Source data are provided with this paper.
Code availability
All the software tools used for analysis in this study are accessible in public repositories. We used R to process the data and perform the computational analysis. SuperCurve can be found at https://bioinformatics.mdanderson.org/public-software/supercurve/. Cytoscape is available at https://cytoscape.org. ComplexHeatmap (ref. 69) and ConsensusClusterPlus (ref. 67) are R packages available on Bioconductor. We used BioRender (https://www.biorender.com) to generate the schematic diagrams and ggplot2 (ref. 77) to generate the data analysis plots. No custom code was generated in the course of this analysis.
References
Ding, L. et al. Perspective on oncogenic processes at the end of the beginning of cancer genomics. Cell 173, 305–320 (2018).
Hutter, C. & Zenklusen, J. C. The Cancer Genome Atlas: creating lasting value beyond its data. Cell 173, 283–285 (2018).
Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
Li, H. et al. The landscape of cancer cell line metabolism. Nat. Med. 25, 850–860 (2019).
Barretina, J. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
Behan, F. M. et al. Prioritization of cancer therapeutic targets using CRISPR–Cas9 screens. Nature 568, 511–516 (2019).
Dempster, J. M. et al. Agreement between two large pan-cancer CRISPR–Cas9 gene dependency data sets. Nat. Commun. 10, 5817 (2019).
Garnett, M. J. et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483, 570–575 (2012).
Meyers, R. M. et al. Computational correction of copy number effect improves specificity of CRISPR–Cas9 essentiality screens in cancer cells. Nat. Genet. 49, 1779–1784 (2017).
Rodriguez, H., Zenklusen, J. C., Staudt, L. M., Doroshow, J. H. & Lowy, D. R. The next horizon in precision oncology: proteogenomics to inform cancer diagnosis and treatment. Cell 184, 1661–1670 (2021).
Akbani, R. et al. Realizing the promise of reverse phase protein arrays for clinical, translational, and basic research: a workshop report: the RPPA (Reverse Phase Protein Array) society. Mol. Cell. Proteomics 13, 1625–1643 (2014).
Akbani, R. et al. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun. 5, 3887 (2014).
Li, J. et al. Characterization of human cancer cell lines by reverse-phase protein arrays. Cancer Cell 31, 225–239 (2017).
Zhao, W. et al. Large-scale characterization of drug responses of clinically relevant proteins in cancer cell lines. Cancer Cell 38, 829–843 (2020).
Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385 (2018).
Berger, A. C. et al. A comprehensive pan-cancer molecular study of gynecologic and breast cancers. Cancer Cell 33, 690–705 (2018).
Fang, Y. et al. Sequential therapy with PARP and WEE1 inhibitors minimizes toxicity while maintaining efficacy. Cancer Cell 35, 851–867 (2019).
Zhang, Y. et al. A pan-cancer proteogenomic atlas of PI3K/AKT/mTOR pathway alterations. Cancer Cell 31, 820–832 (2017).
Li, J. et al. Explore, visualize, and analyze functional cancer proteomic data using The Cancer Proteome Atlas. Cancer Res. 77, e51–e54 (2017).
Li, J. et al. TCPA: a resource for cancer functional proteomics data. Nat. Methods 10, 1046–1047 (2013).
Siwak, D. R., Li, J., Akbani, R., Liang, H. & Lu, Y. Analytical platforms 3: processing samples via the RPPA pipeline to generate large-scale data for clinical studies. Adv. Exp. Med. Biol. 1188, 113–147 (2019).
Nusinow, D. P. et al. Quantitative proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387–402 (2020).
Basu, A. et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 154, 1151–1161 (2013).
Corsello, S. M. et al. Discovering the anti-cancer potential of non-oncology drugs by systematic viability profiling. Nat. Cancer 1, 235–248 (2020).
Iorio, F. et al. A landscape of pharmacogenomic interactions in cancer. Cell 166, 740–754 (2016).
Jin, X. et al. A metastasis map of human cancer cell lines. Nature 588, 331–336 (2020).
Chen, M. M. et al. TCPA v3.0: an integrative platform to explore the pan-cancer analysis of functional proteomic data. Mol. Cell. Proteomics 18, S15–S25 (2019).
Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830 (2018).
Frejno, M. et al. Proteome activity landscapes of tumor cell lines determine drug responses. Nat. Commun. 11, 3639 (2020).
Hoadley, K. A. et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304 (2018).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Tsherniak, A. et al. Defining a cancer dependency map. Cell 170, 564–576 (2017).
Dempster, J. M. et al. Chronos: a cell population dynamics model of CRISPR experiments that improves inference of gene fitness effects. Genome Biol. 22, 343 (2021).
Pacini, C. et al. Integrated cross-study datasets of genetic dependencies in cancer. Nat. Commun. 12, 1661 (2021).
Quintas-Cardama, A. & Cortes, J. Molecular biology of BCR–ABL1-positive chronic myeloid leukemia. Blood 113, 1619–1630 (2009).
Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196 (2010).
Chen, H. et al. Comprehensive assessment of computational algorithms in predicting cancer driver mutations. Genome Biol. 21, 43 (2020).
Ng, P. K. et al. Systematic functional annotation of somatic mutations in cancer. Cancer Cell 33, 450–462 (2018).
Davies, H. et al. Mutations of the BRAF gene in human cancer. Nature 417, 949–954 (2002).
Menzer, C. et al. Targeted therapy in advanced melanoma with rare BRAF mutations. J. Clin. Oncol. 37, 3142–3151 (2019).
Yaeger, R. & Corcoran, R. B. Targeting alterations in the Raf–MEK pathway. Cancer Discov. 9, 329–341 (2019).
Lavoie, H., Gagnon, J. & Therrien, M. ERK signalling: a master regulator of cell behaviour, life and fate. Nat. Rev. Mol. Cell Biol. 21, 607–632 (2020).
Yao, Z. et al. BRAF mutants evade ERK-dependent feedback by different mechanisms that determine their sensitivity to pharmacologic inhibition. Cancer Cell 28, 370–383 (2015).
Negrao, M. V. et al. Molecular landscape of BRAF-mutant NSCLC reveals an association between clonality and driver mutations and identifies targetable non-V600 driver mutations. J. Thorac. Oncol. 15, 1611–1623 (2020).
Chen, S. H. et al. Oncogenic BRAF deletions that function as homodimers and are sensitive to inhibition by Raf dimer inhibitor LY3009120. Cancer Discov. 6, 300–315 (2016).
Eisenhardt, A. E. et al. Functional characterization of a BRAF insertion mutant associated with pilocytic astrocytoma. Int. J. Cancer 129, 2297–2303 (2011).
O’Neil, N. J., Bailey, M. L. & Hieter, P. Synthetic lethality and cancer. Nat. Rev. Genet. 18, 613–623 (2017).
Lee, J. S. et al. Synthetic lethality-mediated precision oncology via the tumor transcriptome. Cell 184, 2487–2502 (2021).
Jaaks, P. et al. Effective drug combinations in breast, colon and pancreatic cancer cells. Nature 603, 166–173 (2022).
Abourehab, M. A. S., Alqahtani, A. M., Youssif, B. G. M. & Gouda, A. M. Globally approved EGFR inhibitors: insights into their syntheses, target kinases, biological activities, receptor interactions, and metabolism. Molecules 26, 6677 (2021).
Sakamoto, K. M. & Frank, D. A. CREB in the pathophysiology of cancer: implications for targeting transcription factors for cancer therapy. Clin. Cancer Res. 15, 2583–2587 (2009).
Riccio, A., Ahn, S., Davenport, C. M., Blendy, J. A. & Ginty, D. D. Mediation by a CREB family transcription factor of NGF-dependent survival of sympathetic neurons. Science 286, 2358–2361 (1999).
Srinivasan, S. et al. Tobacco carcinogen-induced production of GM-CSF activates CREB to promote pancreatic cancer. Cancer Res. 78, 6146–6158 (2018).
Qin, Y. et al. Interfering MSN–NONO complex-activated CREB signaling serves as a therapeutic strategy for triple-negative breast cancer. Sci. Adv. 6, eaaw9960 (2020).
Fares, J., Fares, M. Y., Khachfe, H. H., Salhab, H. A. & Fares, Y. Molecular principles of metastasis: a hallmark of cancer revisited. Signal Transduct. Target. Ther. 5, 28 (2020).
Khan, I. & Steeg, P. S. Metastasis suppressors: functional pathways. Lab. Invest. 98, 198–210 (2017).
Nguyen, D. T. et al. Pharos: collating protein information to shed light on the druggable genome. Nucleic Acids Res. 45, D995–D1002 (2017).
Dou, Y. et al. Proteogenomic characterization of endometrial carcinoma. Cell 180, 729–748 (2020).
Gillette, M. A. et al. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell 182, 200–225 (2020).
Vasaikar, S. et al. Proteogenomic analysis of human colon cancer reveals new therapeutic opportunities. Cell 177, 1035–1049 (2019).
Tibes, R. et al. Reverse phase protein array: validation of a novel proteomic technology and utility for analysis of primary leukemia specimens and hematopoietic stem cells. Mol. Cancer Ther. 5, 2512–2521 (2006).
Hennessy, B. T. et al. A technical assessment of the utility of reverse phase protein arrays for the study of the functional proteome in non-microdissected human breast cancers. Clin. Proteomics 6, 129–151 (2010).
Hu, J. et al. Non-parametric quantification of protein lysate arrays. Bioinformatics 23, 1986–1994 (2007).
Neeley, E. S., Baggerly, K. A. & Kornblau, S. M. Surface adjustment of reverse phase protein arrays using positive control spots. Cancer Inform. 11, 77–86 (2012).
Ju, Z. et al. Development of a robust classifier for quality control of reverse-phase protein arrays. Bioinformatics 31, 912–918 (2015).
Gonzalez-Angulo, A. M. et al. Functional proteomics can define prognosis and predict pathologic complete response in patients with breast cancer. Clin. Proteomics 8, 11 (2011).
Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).
Gu, Z., Gu, L., Eils, R., Schlesner, M. & Brors, B. circlize implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812 (2014).
Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Sanchez-Vega, F. et al. Oncogenic signaling pathways in The Cancer Genome Atlas. Cell 173, 321–337 (2018).
Dogruluk, T. et al. Identification of variant-specific functions of PIK3CA by rapid phenotyping of rare mutations. Cancer Res. 75, 5341–5354 (2015).
Tsang, Y. H. et al. Functional annotation of rare gene aberration drivers of pancreatic cancer. Nat. Commun. 7, 10500 (2016).
Cheung, L. W. et al. Naturally occurring neomorphic PIK3R1 mutations activate the MAPK pathway, dictating therapeutic response to MAPK pathway inhibitors. Cancer Cell 26, 479–494 (2014).
Liang, H. et al. Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer. Genome Res. 22, 2120–2129 (2012).
Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis 2nd edn (Springer, 2016).
Acknowledgements
This study was supported by the NIH (U01CA217842, U01CA281902 and U01CA253472 to H.L. and G.B.M., U24CA264128 to H.L., G.B.M. and R.A., R01CA251150 and P50CA281701 to H.L., P50CA221703 to H.L. and M.A.D., U24CA210950, U24CA210949 and U24CA264006 to R.A. and G.B.M., R50CA221675 to Y.L., P50CA270907 to J.L. and the Cancer Center Support Grant P30CA016672), a kind gift from the Sheldon and Miriam Adelson Medical Research Foundation, Susan G. Komen (SAC110052), the Ovarian Cancer Research Foundation (545152) and the Breast Cancer Research Foundation (BCRF-18-110) to G.B.M., a Department of Defense Congressionally Directed Medical Research Program award (W81XWH-16-1-0237) to R.A., grants from the Cancer Prevention and Research Institute of Texas (RP210042 and RP160015) to S.V.K. and the University Cancer Foundation through the Institutional Research Grant program at the University of Texas MD Anderson Cancer Center to J.L.
Author information
Contributions
R.A., G.B.M. and H.L. conceptualized the project. J.L., W.L., Z.J., S.V.K., P.K.-S.N., H.C., M.A.D., Y.L., R.A., G.B.M. and H.L. contributed to the data analysis and result discussion. K.M., H.K., Z.Z., P.K.-S.N. and Y.L. contributed to the experiments. J.L. and H.L. wrote the manuscript with input from other authors. H.L. supervised the project.
Ethics declarations
Competing interests
M.A.D. has been a consultant to Roche/Genentech, Array, Pfizer, Novartis, BMS, GSK, Sanofi-Aventis, Vaccinex, Apexigen, Eisai, Iovance and ABM Therapeutics and he has been the PI of research grants to MD Anderson from Roche/Genentech, GSK, Sanofi-Aventis, Merck, Myriad, Oncothyreon and ABM Therapeutics. R.A. is a bioinformatics consultant for the University of Houston. G.B.M. is a scientific advisory board member or consultant for AstraZeneca, Chrysalis Biotechnology, GSK, ImmunoMET, Ionis, Lilly, PDX Pharmaceuticals, Signalchem Lifesciences, Symphogen, Tarveda, Turbine and Zentalis Pharmaceuticals, has stocks, options or financial considerations with Catena Pharmaceuticals, ImmunoMet, SignalChem and Tarveda and has licensed the HRD assay to Myriad Genetics and DSP patents to Nanostring. H.L. is a shareholder and advisor for Precision Scientific. The other authors declare no competing interests.
Peer review
Peer review information
Nature Cancer thanks Bing Zhang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Quality control of RPPA data.
(a) Summary of antibody selection and validation process. (b) Overview of RPPA data normalization, with rigorous QC parameters or controls at each step. (c) Comparison of protein markers in Phase I and Phase I & II. (d) Boxplots showing differential correlations of RPPA-based IFN-gamma response pathway score and that reported by literature in Phase I and Phase I & II. N = 32 represents the number of cancer types. (e) Comparison of correlations between RPPA and MS/RNA-seq in Phase I (N = 204) and Phase II (N = 243). (c-e) P-values are based on paired Wilcoxon tests. N represents the number of protein markers. (f) The distributions of the lineage-specific expression correlations of RPPA-based proteins with RNA-seq-based mRNA (top panel), MS-based total proteins (middle panel), and correlations between MS-based total proteins and RNA-seq-based mRNA (bottom panel). The distributions are shown for total proteins and PTM proteins, respectively. (g) A scatter plot showing the correlation between the sample sizes and the mean correlations across different lineages. N = 49 represents the number of mean correlations. (h) The distribution of expression correlations between RPPA-based and MS-based phosphorylated proteins in NCI60 cell lines. (i) A scatter plot showing a representative example of the phosphoprotein, HSP27_pS82, between the RPPA and the MS data. N = 38 represents the number of cell line samples. (c-e) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. (g, i) Shaded areas denote the 95% confidence intervals. The p-values are based on Spearman’s correlation coefficient test.
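The box-plot convention used in these captions (median line, first and third quartiles as the box, whiskers bounded by 1.5× the interquartile range) can be illustrated with a minimal sketch. The authors report using R and ggplot2; the Python below is a stand-in, not their code. The function names `quantile` and `box_stats` are hypothetical, and the choice of linear-interpolation quantiles and of whiskers ending at the most extreme data point inside the fence is an assumption about the plotting convention.

```python
def quantile(sorted_vals, q):
    """Quantile by linear interpolation between order statistics.

    Assumption: this matches the default quantile definition in R
    (type 7) and NumPy; the source does not state which one was used.
    """
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def box_stats(values):
    """Return the five box-plot numbers plus any points beyond the whiskers."""
    if not values:
        raise ValueError("values must not be empty")
    xs = sorted(values)
    q1 = quantile(xs, 0.25)
    median = quantile(xs, 0.50)
    q3 = quantile(xs, 0.75)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations still inside the fences.
    inside = [x for x in xs if low_fence <= x <= high_fence]
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Toy values, not data from the study.
    print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

With the toy input, the value 100 lies above the upper fence, so it is reported as an outlier and the upper whisker ends at 8.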
Extended Data Fig. 2 Pan-cancer analysis of TCGA samples.
(a) Forest plots of hazard ratios for cluster RPPA_K7 and clinical variables. The p-values are based on a multivariate Cox proportional hazards model. The center point of each horizontal bar represents the estimated hazard ratio. (b) Volcano plots showing differentially expressed protein markers between the corresponding cluster pairs identified from the patient survival analysis. The p-values are based on the Wilcoxon test. N represents the number of significant positive or negative protein markers (KIRC: N = 131 and 136 for significantly positive and negative protein markers, respectively; CESC: N = 143 and 192 for significantly positive and negative protein markers, respectively). (c, d) Significantly up- and downregulated pathways, EMT (c) and IFN-α (d), identified by pathway analysis for K7 (N = 16) vs. K10 (N = 237) in KIRC and K7 (N = 60) vs. K1 (N = 100) in CESC. The p-values are based on the Wilcoxon test. N represents the number of patient samples in each group. (e, f) Associations of RPPA clusters with driver mutations (e) and copy number alterations (f). Co-occurrence and mutual exclusivity are shown in different colors. The p-values are based on the Chi-squared test. Significant hits with adjustments for cancer types are shaded. N represents the number of significant hits. (g, h) Boxplots showing differential patterns of protein-mRNA coupling across TCGA patient cohorts by copy number alteration of cis-cancer driver genes (copy number amplification: N = 59; and copy number deletion: N = 24). The p-values are based on the one-sided paired Wilcoxon test. N represents the number of protein-alteration pairs. (c, d, g, h) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.
Extended Data Fig. 3 Pan-cancer analysis of CCLE samples.
(a, b) Sample distributions of different cell lineages (a) and cancer types (b) in RPPA-based clusters. (c, d) Box plots of MELANA (c) and SGK1 (d) in skin cancer (N = 49) and other cell lines (N = 829). (e) Box plot of BCL2 expression in luminal (N = 11) and other subtypes (N = 37) of breast cancer. (f) Box plot of expression of the proteins involved in the cell cycle in fibroblasts (N = 17) and other cell lines (N = 861). (c-f) N represents the number of cell line samples. (g) Box plots showing the comparison of mRNA-based |log2(FC)| vs. protein-based |log2(FC)| between lineages for the lineage-specific markers. N = 2,312 represents the number of protein-lineage pairs. (c-g) The p-values are based on the two-sided Wilcoxon test. (h) Box plots showing differential fold changes of TP53 mRNA and protein expression between TP53 mutant and wild-type TCGA tumor samples. The p-value is based on the paired Wilcoxon test. N = 30 represents the number of cancer types. (i) Effect of mutations in different P53 domains on P53 protein expression. The p-values are based on the Wilcoxon test. (j) Box plots showing differential TP53 mRNA (left) and P53 protein (right) expression in TP53 wild-type samples (N = 159) and those harboring mutations in the P53 DNA-binding domain (DBD; N = 267). (k) Box plots showing differential RPPA expression of P53 between TP53 nonsense mutant (N = 40) and wild-type samples (N = 159). (l) Box plots showing differential gene dependency of TP53 between TP53 mutant (N = 364) and wild-type samples (N = 159). (i-l) N represents the number of cell line samples in each group. (m) A bar plot showing the correlation of gene dependency with PKAA protein and mRNA expression in different cell line lineages. Significant correlations (FDR < 0.1) are marked by an asterisk. (n) A bar plot summarizing the number of lineages in which each protein marker is significantly associated with its gene dependency but not with mRNA (FDR < 0.1). N represents the number of protein markers (phase I: N = 21; phase II: N = 25). (c-l) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.
Extended Data Fig. 4 Differential effects of BRAF mutations on MEK1/2 mRNA and protein expression.
(a) A scatter plot showing the correlation between MEK1/2_pS217S221 RPPA and BRAF gene dependency in melanoma cell lines (N = 36). Shaded areas denote the 95% confidence intervals. (b, c) Box plots showing differential expression of MEK1 mRNA (b) and MEK2 mRNA (c) based on the functional effects of BRAF mutations characterized by the cell viability assays. (a-c) N represents the number of cell line samples. (d) Box plots showing the differential expression of MEK1/2_pS217S221 RPPA between the patient samples with an activating BRAF mutation (N = 119) and those with wild-type BRAF (N = 60). (b-d) The p-values are based on the Wilcoxon test. N represents the number of patient samples. (e) A heatmap showing detailed information on the BRAF mutations, their functional effects, and the corresponding MEK1/2_pS217S221 RPPA expression pattern. (b-d) The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively.
Extended Data Fig. 5 In vitro validation of PKAA-EGFR synthetic lethal interaction.
(a) A Kaplan-Meier plot showing distinct survival probabilities of bladder cancer patients with both low PKAA and EGFR protein levels. The p-value is based on a log-rank test. Shaded areas denote the 95% confidence intervals. N represents the number of patients in each group (both low: N = 77; and others: N = 257). (b) A forest plot of hazard ratios for PKAA protein and clinical variables. The p-values are based on a multivariate Cox proportional hazards model. The center point of each horizontal bar represents the estimated hazard ratio. N represents the number of patients. (c) Box plots showing the differential gene dependency of PKAA between samples with EGFR deletion (N = 125) and amplification (N = 463). The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. N represents the number of cell line samples. (d) Enrichment of EGFR inhibitors in drugs resistant to high PKAA levels. The p-value is based on Fisher’s exact test. (e, f) Relative mRNA level of PKAA in PKAA-KD A549 or H226 cells. N = 2 independent replicates were examined for each condition. (g-l) Drug response assays at 72 h for A549 (g-i) or H226 (j-l) PKAA-KD and control cells treated with three EGFR inhibitors, Afatinib, Erlotinib, and Osimertinib (DMSO and 9 drug concentrations). N = 3 independent replicates were examined for each treatment and perturbation. Data are shown as mean ± SEM. The p-values are based on ANOVA.
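Panel (d) reports an enrichment p-value from Fisher’s exact test. As an illustration of how such a p-value arises from a 2×2 contingency table, the sketch below computes the one-sided (enrichment) tail probability from the hypergeometric distribution using only the Python standard library. It is not the authors’ code: the function name `fisher_enrichment_p` and the table layout are hypothetical, and the caption does not say whether the reported test was one- or two-sided, so the one-sided form here is an assumption.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for over-representation.

    Hypothetical table layout (an assumption for illustration):
                        in drug class   not in drug class
        in hit list          a                 b
        not in hit list      c                 d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # size of the hit list
    row2 = c + d          # size of the background outside the hit list
    col1 = a + c          # size of the drug class
    total = row1 + row2
    denom = comb(total, col1)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1))
    return tail / denom


if __name__ == "__main__":
    # Toy counts, not data from the study.
    print(fisher_enrichment_p(3, 1, 1, 3))
```

For the toy table, the tail sums two configurations (16 + 1 out of 70 possible tables), giving 17/70.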
Extended Data Fig. 6 Evaluation of metastasis markers in cancer cell lines and patient samples.
(a) Box plots showing the log2(FC) between metastatic and primary cell lines of anti- (N = 37) and pro-metastasis (N = 35) marker RPPA expression. The p-value is based on the Wilcoxon test. The middle line in the box is the median, the bottom and top of the box are the first and third quartiles, and the whiskers extend to the 1.5× interquartile range of the lower and the upper quartiles, respectively. N represents the number of protein markers in each group. (b, c) Relative mRNA level of CDK9 in CDK9-KD and control MDA-MB-231 or A549 cells. N = 3 independent replicates were examined for each perturbation. Data are shown as mean ± SEM. (d, e) ROC curves showing predictive powers of GCLM expression in ACC patients (N = 44) (d) and CHK1 in SARC patients (N = 223) (e) between metastatic and non-metastatic primary tumor samples. N represents the number of patients. (f) A pie chart showing the distribution of drug development levels for all the identified pro-metastasis protein markers. The annotation data were obtained from the Pharos database. N represents the number of protein markers in each group (Tclin: N = 6; Tchem: N = 13; and Tbio: N = 15).
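The predictive power summarized by the ROC curves in panels (d) and (e) is commonly reported as the area under the curve (AUC). A minimal sketch of that quantity follows, using the pairwise-comparison definition: the AUC equals the fraction of (metastatic, non-metastatic) sample pairs in which the metastatic sample has the higher marker value, with ties counted as one half. This is not the authors’ code; the function name `roc_auc` is hypothetical and the caption does not state how the curves were computed.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the Mann-Whitney U statistic divided by the number
    of positive/negative pairs. Ties contribute 0.5.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Toy marker values, not data from the study.
    metastatic = [0.9, 0.8, 0.4]
    non_metastatic = [0.5, 0.3, 0.2]
    print(roc_auc(metastatic, non_metastatic))
```

The quadratic loop keeps the definition transparent; for cohorts of a few hundred patients, as in these panels, it is fast enough, and a rank-based formulation would give the same value.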
Supplementary information
Supplementary Tables
Supplementary Table 1: Information on TCGA samples profiled in this study. Supplementary Table 2: Information on CCLE samples profiled in this study. Supplementary Table 3: Information on RPPA500 protein markers and corresponding antibodies. Supplementary Table 4: Information on hallmark gene sets related protein markers. Supplementary Table 5: Predicted synthetic lethal protein pairs.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, J., Liu, W., Mojumdar, K. et al. A protein expression atlas on tissue samples and cell lines from cancer patients provides insights into tumor heterogeneity and dependencies. Nat Cancer (2024). https://doi.org/10.1038/s43018-024-00817-x
DOI: https://doi.org/10.1038/s43018-024-00817-x