Abstract
We report a proteogenomic analysis of pancreatic ductal adenocarcinoma (PDAC). Mutation–phosphorylation correlations identified signaling pathways associated with somatic mutations in significantly mutated genes. Messenger RNA–protein abundance correlations revealed potential prognostic biomarkers correlated with patient survival. Integrated clustering of mRNA, protein and phosphorylation data identified six PDAC subtypes. The cellular pathways represented by the mRNA and protein signatures that define the subtypes, together with the cell-type compositions of the subtypes, characterized them as classical progenitor (TS1), squamous (TS2–4), immunogenic progenitor (IS1) and exocrine-like (IS2) subtypes. Compared with the mRNA data alone, the protein and phosphorylation data further resolved the squamous subtypes into activated stroma-enriched (TS2), invasive (TS3) and invasive-proliferative (TS4) squamous subtypes. Orthotopic mouse PDAC models revealed a higher number of pro-tumorigenic immune cells in TS4, which inhibited T cell proliferation. Our proteogenomic analysis provides significantly mutated genes/biomarkers, cellular pathways and cell types as potential therapeutic targets to improve stratification of patients with PDAC.
Data availability
The exome and mRNA sequencing data generated in this study are available in dbGaP (https://www.ncbi.nlm.nih.gov/gap/; accession ID: phs002347.v1.p1). Mass spectrometry-based global and phosphoproteomic data are available in the PDC (https://pdc.cancer.gov; accession ID: PDC000248 for global proteome and PDC000249 for phosphoproteome). Previously published somatic mutation, mRNA expression and clinical data for PDAC cohorts that were reanalyzed here were downloaded from the GDC data portal (https://portal.gdc.cancer.gov/) and the ICGC data portal (https://dcc.icgc.org/). Previously published scRNA-seq data for PDAC patients reanalyzed here are available under accession code GSE127471. SwissProt protein sequence data (v.2018.01) was downloaded from the UniProt database (https://ftp.uniprot.org/pub/databases/uniprot/previous_releases/release-2018_01/knowledgebase/) to construct the multiplexed unified protein database. Source data are provided with this paper. All other relevant data supporting the findings of this study are available from the corresponding authors on reasonable request.
Code availability
All custom code used for the analyses in this manuscript has been uploaded to a GitHub repository, with instructions for users: https://github.com/doyoungh/Hyeon_et_al_PDAC_Nature_Cancer.
References
Collisson, E. A., Bailey, P., Chang, D. K. & Biankin, A. V. Molecular subtypes of pancreatic cancer. Nat. Rev. Gastroenterol. Hepatol. 16, 207–220 (2019).
Ansari, D., Toren, W., Zhou, Q., Hu, D. & Andersson, R. Proteomic and genomic profiling of pancreatic cancer. Cell Biol. Toxicol. 35, 333–343 (2019).
Kafita, D., Nkhoma, P., Zulu, M. & Sinkala, M. Proteogenomic analysis of pancreatic cancer subtypes. PLoS ONE 16, e0257084 (2021).
Mertins, P. et al. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62 (2016).
Mun, D. G. et al. Proteogenomic characterization of human early-onset gastric cancer. Cancer Cell 35, 111–124.e10 (2019).
Zhang, B. et al. Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387 (2014).
Zhang, H. et al. Integrated proteogenomic characterization of human high-grade serous ovarian cancer. Cell 166, 755–765 (2016).
Song, S. et al. qpure: a tool to estimate tumor cellularity from genome-wide single-nucleotide polymorphism profiles. PLoS ONE 7, e45835 (2012).
Park, H. et al. Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses. Proteomics https://doi.org/10.1002/pmic.201400225 (2014).
The Cancer Genome Atlas Research Network. Integrated genomic characterization of pancreatic ductal adenocarcinoma. Cancer Cell 32, 185–203.e13 (2017).
Bailey, P. et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature 531, 47–52 (2016).
Biankin, A. V. et al. Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature 491, 399–405 (2012).
Waddell, N. et al. Whole genomes redefine the mutational landscape of pancreatic cancer. Nature 518, 495–501 (2015).
Witkiewicz, A. K. et al. Whole-exome sequencing of pancreatic cancer defines genetic diversity and therapeutic targets. Nat. Commun. 6, 6744 (2015).
Vasaikar, S. et al. Proteogenomic analysis of human colon cancer reveals new therapeutic opportunities. Cell 177, 1035–1049.e19 (2019).
Lu, D., Wang, J., Shi, X., Yue, B. & Hao, J. AHNAK2 is a potential prognostic biomarker in patients with PDAC. Oncotarget 8, 31775–31784 (2017).
Cao, L. et al. Proteogenomic characterization of pancreatic ductal adenocarcinoma. Cell 184, 5031–5052.e26 (2021).
Scarlett, C. J., Salisbury, E. L., Biankin, A. V. & Kench, J. Precursor lesions in pancreatic cancer: morphological and molecular pathology. Pathology 43, 183–200 (2011).
Zhang, J. et al. International Cancer Genome Consortium Data Portal—a one-stop shop for cancer genomics data. Database (Oxford) 2011, bar026 (2011).
Collisson, E. A. et al. Subtypes of pancreatic ductal adenocarcinoma and their differing responses to therapy. Nat. Med. 17, 500–503 (2011).
Moffitt, R. A. et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat. Genet. 47, 1168–1178 (2015).
Puleo, F. et al. Stratification of pancreatic ductal adenocarcinomas based on tumor and microenvironment features. Gastroenterology 155, 1999–2013.e3 (2018).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
Tibshirani, R., Hastie, T., Narasimhan, B. & Chu, G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc. Natl Acad. Sci. USA 99, 6567–6572 (2002).
Kamburov, A., Wierling, C., Lehrach, H. & Herwig, R. ConsensusPathDB—a database for integrating human functional interaction networks. Nucleic Acids Res. 37, D623–D628 (2009).
Maurer, C. et al. Experimental microdissection enables functional harmonisation of pancreatic cancer subtypes. Gut 68, 1034–1043 (2019).
Peng, X. L., Moffitt, R. A., Torphy, R. J., Volmar, K. E. & Yeh, J. J. De novo compartment deconvolution and weight estimation of tumor samples using DECODER. Nat. Commun. 10, 4729 (2019).
Peng, J. et al. Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma. Cell Res. 29, 725–738 (2019).
Krug, K. et al. Proteogenomic landscape of breast cancer tumorigenesis and targeted therapy. Cell 183, 1436–1456.e31 (2020).
Xu, J. Y. et al. Integrative proteomic characterization of human lung adenocarcinoma. Cell 182, 245–261.e17 (2020).
Jeong, J., Suh, Y. & Jung, K. Context drives diversification of monocytes and neutrophils in orchestrating the tumor microenvironment. Front. Immunol. 10, 1817 (2019).
Jung, K. et al. Targeting CXCR4-dependent immunosuppressive Ly6Clow monocytes improves antiangiogenic therapy in colorectal cancer. Proc. Natl Acad. Sci. USA 114, 10455–10460 (2017).
Jung, K. et al. Ly6Clo monocytes drive immunosuppression and confer resistance to anti-VEGFR2 cancer therapy. J. Clin. Invest. 127, 3039–3051 (2017).
Torres, M. P., Chakraborty, S., Souchek, J. & Batra, S. K. Mucin-based targeted pancreatic cancer therapy. Curr. Pharm. Des. 18, 2472–2481 (2012).
Wang, S., You, L., Dai, M. & Zhao, Y. Mucins in pancreatic cancer: a well-established but promising family for diagnosis, prognosis and therapy. J. Cell. Mol. Med. 24, 10279–10289 (2020).
Qin, S. et al. Novel immune checkpoint targets: moving beyond PD-1 and CTLA-4. Mol. Cancer 18, 155 (2019).
Kim, S., Kim, S. A., Han, J. & Kim, I. S. Rho-kinase as a target for cancer therapy and its immunotherapeutic potential. Int. J. Mol. Sci. https://doi.org/10.3390/ijms222312916 (2021).
Wei, L., Surma, M., Shi, S., Lambert-Cheatham, N. & Shi, J. Novel insights into the roles of Rho kinase in cancer. Arch. Immunol. Ther. Exp. (Warsz.) 64, 259–278 (2016).
Nishiwada, S. et al. Clinical significance of CD155 expression in human pancreatic cancer. Anticancer Res. 35, 2287–2297 (2015).
Freed-Pastor, W. A. et al. The CD155/TIGIT axis promotes and maintains immune evasion in neoantigen-expressing pancreatic cancer. Cancer Cell 39, 1342–1360.e14 (2021).
Xin, L. et al. Fibroblast activation protein-α as a target in the bench-to-bedside diagnosis and treatment of tumors: a narrative review. Front. Oncol. 11, 648187 (2021).
Jeong, J. et al. Tumor-infiltrating neutrophils and non-classical monocytes may be potential therapeutic targets for HER2negative gastric cancer. Immune Netw. 21, e31 (2021).
Wijnen, R. et al. Cyclin dependent kinase-1 (CDK-1) inhibition as a novel therapeutic strategy against pancreatic ductal adenocarcinoma (PDAC). Cancers (Basel) https://doi.org/10.3390/cancers13174389 (2021).
Hyeon, D. Y. et al. Whole exome sequencing and RNA sequencing analyses of PDAC samples. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2062/v1 (2022).
Hyeon, D. Y. et al. Mass spectrometry-based proteomic analysis of PDAC samples. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2063/v1 (2022).
Hwang, D. et al. A data integration methodology for systems biology. Proc. Natl Acad. Sci. USA 102, 17296–17301 (2005).
Gibbons, J. D. Nonparametric statistical inference. Biometrics 28, 1148–1149 (1972).
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Hyeon, D. Y. et al. Cell-based assays for potential prognostic biomarkers in PDAC. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2065/v1 (2022).
Kim, Y. et al. Principal network analysis: identification of subnetworks representing major dynamics using gene expression data. Bioinformatics 27, 391–398 (2011).
Brunet, J. P., Tamayo, P., Golub, T. R. & Mesirov, J. P. Metagenes and molecular pattern discovery using matrix factorization. Proc. Natl Acad. Sci. USA 101, 4164–4169 (2004).
Bhin, J. et al. PGC-enriched miRNAs control germ cell development. Mol. Cells 38, 895–903 (2015).
Hwang, D. et al. A data integration methodology for systems biology: experimental verification. Proc. Natl Acad. Sci. USA 102, 17302–17307 (2005).
Newman, A. M. et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782 (2019).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
Xu, C. & Su, Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980 (2015).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).
Hyeon, D. Y. et al. Bioinformatics analysis of PDAC subtypes. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2064/v1 (2022).
Hyeon, D. Y. et al. Assays for orthotopic PDAC mouse models. Nat. Protoc. https://doi.org/10.21203/rs.3.pex-2066/v1 (2022).
Allen, H. et al. Human placental-derived adherent stromal cells co-induced with TNF-α and IFN-γ inhibit triple-negative breast cancer in nude mouse xenograft models. Sci. Rep. 8, 670 (2018).
Dong, L. et al. Proteogenomic characterization identifies clinically relevant subgroups of intrahepatic cholangiocarcinoma. Cancer Cell 40, 70–87.e15 (2022).
Wang, J. multiOmicsViz: plot the effect of one omics data on other omics data along the chromosome. R package version 1.20.0 (2022).
Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
Acknowledgements
This work was supported by grants from the Collaborative Genome Program for Fostering New Post-Genome Industry through the National Research Foundation (grant no. NRF-2017M3C9A5031397) funded by the Korean Ministry of Science and ICT. This work was also supported by grants from the Creative Research Initiative Program (grant no. NRF-2017R1A3B1023387 to S.H.B.) and the National Research Foundation of Korea (grant no. NRF-2020R1C1C1015062 to K.J., grant no. NRF-2022M3H9A2086450 to S.-W.L. and grant no. 2022R1A2C2011122 to J.-Y.J.). This work was conducted under the auspices of a Memorandum of Understanding between Korea University’s Center for Proteogenome Research and the US National Cancer Institute’s International Cancer Proteogenome Consortium (ICPC). The ICPC encourages international cooperation among institutions and nations in proteogenomic cancer research in which proteogenomic datasets are made available to the public. This study was also conducted in collaboration with the US National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium. The biological specimens used in this study were provided by the Biobank at Seoul National University Hospital (SNUH), a member of the Korean Biobank Network (KBN4_A03), and SNUH Cancer Tissue Bank.
Author information
Contributions
S.-W.L., D.H., J.-Y.J., S.H.B., K.J. and D.P. designed and directed the integrated proteogenomic analysis. Y.H., W.K., Y.-A.S., Hongbeom K. and J.-Y.J. collected, characterized and processed the tumor samples. D.K., S.H., D.P., D.Y.H., D.H., T.G. and T.P. performed genomic analysis of tumor samples and analyzed the genomic data. S.-W.L. and D.H. designed the proteomic experiments. D.N., J.B., S.B., D.-G.M., I.H.M., H.L., S.-J.K., Hokeun K., S.-W.L., S.C., E.P. and M.-S.K. performed global proteome and phosphoproteome profiling experiments, unified database searches and analyzed the proteomic data. D.Y.H., D.H., D.K. and D.P. performed integrated analyses with genomic and proteomic data, as well as clustering, network and deconvolution analyses. J.-L.K. established primary cell lines from the human tumor samples. G.K., C.R.K., S.A.C., Y.R.K. and S.H.B. performed functional experiments to test the prognostic biomarkers. K.B.L. performed immunohistochemistry experiments to validate the markers for cell types and pathways in the network models. D.K.K., J.J., S.J., Y.W.C. and K.J. performed functional experiments to test the interactions between PMN-MDSCs and T cells. D.H., S.-W.L., K.J., S.H.B., J.-Y.J., D.P., D.Y.H., D.N., Y.H., D.K.K., G.K. and D.K. wrote the manuscript. All authors have read and approved the final version of the manuscript for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Cancer thanks Steven Gallinger, Karin Rodland and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Proteogenomic analysis of PDAC.
a, The workflow for proteogenomic analysis of PDAC. Exome sequencing of cancer tissues and blood samples and RNA sequencing of cancer tissues were performed for samples from 196 patients, while mass spectrometry-based proteomic analyses (global proteome and phosphoproteome) were performed for high-purity samples (cellularity ≥ 15%) from 150 patients. The distribution of tumor cellularity is shown for the 196 samples (bottom left). Only the samples with cellularity ≥ 15% were used for the analysis. In the box plot, the center line indicates the median value and the box limits indicate the upper and lower quartiles. b, Numbers of non-redundant peptides identified from global proteome and phosphoproteome data. c, Numbers of protein-coding genes identified from mRNA sequencing and proteome data (global proteome and phosphoproteome). The average numbers of peptides and protein-coding genes are indicated in (b) and (c), respectively. d, Numbers of somatic mutations altering protein sequences and genes carrying the mutations identified from exome-sequencing data. e, Relationships among significantly mutated genes (SMGs) identified from our cohort and the previous cohorts of the TCGA (ref. 10), Bailey et al. (ref. 11), Biankin et al. (ref. 12), Waddell et al. (ref. 13) and Witkiewicz et al. (ref. 14).
Extended Data Fig. 2 Mutation-protein abundance and phosphorylation correlations reveal implication of apoptosis and the actin cytoskeleton in PDAC.
a-g, Association of somatic mutations with protein abundance or phosphorylation level in TP53 (a), RB1 (b), ATM (c), ARID1A (d), KMT2D (e), AHNAK2 (f), and FCGBP (g). Lollipop plots show the detected somatic mutations (circles) and phosphorylation sites (triangles) in the gene structures (top). The height of the lollipop indicates the number of patients with the corresponding mutations, and colors indicate the mutation types (see legend). The samples were sorted based on the somatic mutations. Protein abundances or intensities of the phosphorylated peptides, normalized to their median levels across all patients, are displayed in bar plots (bottom; red and blue, higher and lower than the median, respectively). Enriched mutation sites are indicated by arrows. h, Correlations of copy number variations (CNVs) with mRNA (left) and protein (right) expression levels (n = 93 patients). Red and blue indicate significant (FDR < 0.05 determined by multiOmicsViz (ref. 64)) positive and negative correlations, respectively. Diagonal and off-diagonal elements in the heat maps indicate cis- and trans-correlations, respectively, between CNVs and expression levels of mRNA or protein. Along the chromosome, the numbers of CNVs that correlated specifically with either mRNA or protein expression levels and commonly with both mRNA and protein expression levels are displayed in blue (top) and black (bottom) bars, respectively.
Extended Data Fig. 3 mRNA-protein abundance correlation suggests potential prognostic biomarkers.
a, Distribution of Spearman’s correlation coefficients of mRNA and protein abundances of individual genes across patients (n = 150 patients). Yellow and blue, positive and negative correlations, respectively. b, Differential association of the genes having high and low mRNA-protein abundance correlations with KEGG pathways (n = 150 patients). The genes involved in each KEGG pathway are indicated by yellow (positive correlation) and blue (negative correlation) bars. c, Cumulative density distributions of mRNA-protein correlations for the genes with significant (blue) and non-significant (red) survival differences between the top and bottom 25% of patients with the highest and the lowest mRNA expression levels, respectively. p = 1.4 × 10^−18 by two-sided Kolmogorov-Smirnov test. d, Distributions of survival differences (Chi-square statistic values) for the genes with significant (FDR < 0.01) and non-significant (FDR > 0.1) mRNA–protein correlations. p = 3.2 × 10^−18 by two-sided Student’s t-test. In violin plots, the line indicates the median value. n = 150 and 196 patients for mRNA-protein correlation and survival analysis, respectively (c-d). e, Cellular pathways represented by the genes showing negative mRNA-survival correlations. The enrichment significance of each pathway is indicated as −log10(p-value), where the p-value is the enrichment p-value from ConsensusPathDB. The two dotted lines mark p = 0.05 and 0.01. f, Selection of prognostic biomarkers for functional experiments. Among 19 prognostic biomarkers with significant negative (hazard ratio > 1) mRNA-survival correlations in the four PDAC cohorts, the candidates with desirable survival curve patterns (red and blue lines, top and bottom 25% of patients with the highest and the lowest mRNA expression levels) in at least 3 PDAC cohorts were selected, resulting in 16 prognostic biomarkers. Among them, 7 prognostic biomarkers that have not been previously reported in pancreatic cancer were selected. Finally, among these, the 5 prognostic biomarkers with expression levels higher than the median expression (log2-FPKM = 3.11) of expressed genes (FPKM > 1) in AsPC1 cells were selected. Of these, KRT19 was excluded because of its involvement in a broad spectrum of functions. The mRNA expression profile of AsPC1 cells was obtained from the Cancer Cell Line Encyclopedia (CCLE) (ref. 65). FPKM, fragments per kilobase of transcript per million.
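As an illustration of the per-gene correlation summarized in panel a, the following minimal Python sketch computes Spearman’s correlation between mRNA and protein abundances for each gene across patients. It is not the authors’ code: the gene names and values are hypothetical toy data, and the use of average ranks for ties is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: two genes measured in five patients.
mrna = {"GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0], "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0]}
protein = {"GENE_A": [0.2, 0.4, 0.9, 1.5, 3.0], "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0]}

# One coefficient per gene, computed across patients.
rho = {gene: spearman(mrna[gene], protein[gene]) for gene in mrna}
```

In this toy example GENE_A is perfectly concordant and GENE_B perfectly discordant; on real data the resulting coefficients would form the distribution shown in the panel.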
Extended Data Fig. 4 Functional validation of potential prognostic biomarkers by cell-based assays in PANC1 cells.
a, Results of cell-based assays including proliferation curve, 3D spheroid formation, transwell migration, and matrigel invasion using PANC1 cells are summarized in the table. Each value written in the table corresponds to log2 value of fold change. Fold change was calculated as a relative quantification of results from respective assays comparing proliferation/migration of cells expressing each shRNA to that of cells expressing control shRNA (shCtrl). b, Knock-down efficiency of each shRNA targeting potential prognostic biomarkers was measured with immunoblot analysis. Knock-down efficiency was validated three times independently. c, Proliferation curves of PANC1 cells were drawn from the result of cell counting upon the expression of shCtrl or the indicated shRNA targeting prognostic biomarkers (n = 3 independent experiments). P < 0.0001 for all shRNAs. d, 3D spheroid volume of PANC1 cells after 7 days is presented as bar graph with representative images (n = 5 independent spheroids). Scale bar, 500 μm. P < 0.0001 for all shRNAs. e, Representative images from the transwell migration with quantification of average number of migrated PANC1 cells expressing the indicated shRNAs in each image are shown (n = 5 independent assays). Scale bar, 200 μm. P = 7.0 × 10^−4 for shPPL-2 and P < 0.0001 for the others. f, Representative images from the matrigel invasion assay with quantification of average number of migrated PANC1 cells expressing the indicated shRNAs in each image are shown (n = 5 independent assays). Scale bar, 200 μm. P = 4.0 × 10^−4 for shTPI1-1, P = 1.0 × 10^−3 for shTPI1-2, P = 5.2 × 10^−3 for shDCBLD2-1, and P < 0.0001 for the others. *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001 by two-way (c) and one-way (d-f) analysis of variance (ANOVA) with Dunnett’s post hoc correction. All data are shown as mean ± SEM (c-f).
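The log2 fold change values tabulated in panel a can be illustrated with a short Python sketch. This is a sketch under stated assumptions rather than the authors’ procedure: the ratio is taken between the mean readout of the knock-down replicates and the mean readout of the shCtrl replicates, and the counts are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(knockdown_values, control_values):
    """log2 of (mean knock-down readout / mean control readout).

    Assumption: replicates are averaged before the ratio is taken.
    """
    return math.log2(mean(knockdown_values) / mean(control_values))


# Hypothetical migrated-cell counts per image (not data from the study).
sh_ctrl = [200, 210, 190]
sh_target = [50, 55, 45]

# A negative value means the knock-down reduced the readout relative to shCtrl.
lfc = log2_fold_change(sh_target, sh_ctrl)
```

On this scale a value of −1 corresponds to a halving and −2 to a four-fold reduction relative to the control.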
Extended Data Fig. 5 Clustering of samples using mRNA, protein, or phosphorylation data.
a, Two-stage clustering scheme. See Methods. b, Scatter plots comparing the four cellularity measures pairwise (n = 196, 196, 196, and 134 samples for qpure, KRAS VAF, ESTIMATE, and Histological, respectively). Spearman’s correlations and two-sided adjusted P-values (P) by random permutation experiments are shown. c, Two-stage clustering results of mRNA data (n = 98 and 196 samples for 1st and 2nd clustering, respectively). In the 1st (left) or 2nd (right) clustering, cophenetic correlation coefficient plots show how the coefficients vary with different numbers of clusters and also when different numbers of the molecules selected with multiple percentages (10 to 30%) of median absolute deviations (MADs) were used for clustering. The heat maps show the sample consensus obtained from pair-wise clustering. Blue-to-red gradient denotes the percentage of agreement in the clustering results in 100 clustering trials with a determined number of clusters. d-e, Two-stage clustering results of protein (d) and phosphorylation (e) data (n = 95 and 150 samples for 1st and 2nd clustering, respectively). See the legend in c. Colored bars in the bottom represent the samples belonging to the clusters: hi-RNA1-2 and RNA1-3 (c); hi-Prot1-3 and Prot1-5 (d); and hi-Phos1-3 and Phos1-5 (e).
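The MAD-based selection of molecules mentioned in panel c can be sketched as follows. This minimal Python example assumes that "percentages of MADs" means keeping the top fraction of molecules ranked by median absolute deviation across samples; that reading, the function names and the toy matrix are assumptions, not details confirmed by the legend.

```python
from statistics import median


def mad(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def select_top_mad(matrix, fraction):
    """Return the names of the top `fraction` of molecules ranked by MAD.

    `matrix` maps a molecule name to its levels across samples.
    """
    ranked = sorted(matrix, key=lambda name: mad(matrix[name]), reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))
    return ranked[:n_keep]


# Hypothetical toy matrix: five molecules measured in four samples.
matrix = {
    "mol_1": [0.0, 0.1, -0.1, 0.0],
    "mol_2": [2.0, -2.0, 1.5, -1.5],
    "mol_3": [0.5, -0.5, 0.4, -0.4],
    "mol_4": [1.0, -1.0, 0.9, -0.9],
    "mol_5": [0.2, -0.2, 0.1, -0.1],
}

# Repeat the selection at several thresholds, as one would before clustering.
selected = {fraction: select_top_mad(matrix, fraction) for fraction in (0.2, 0.4)}
```

Each selected set would then be passed to the clustering step, and the stability of the result compared across thresholds.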
Extended Data Fig. 6 Molecular signatures and cell types define the characteristics of the subtypes.
a, The selection of the mRNA and protein signatures that defined Sub6, and their use for GSEA, is shown as an example (n = 150 tumors). Integrative clustering (top left) shows that Sub6 was defined by RNA3, Prot5, and Phos5 clusters. Based on this information, 702 genes (rna3) defining RNA3 were selected as mRNA signatures (S6-G), and 1022 proteins (prot5) and 657 phosphopeptides (phos5) defining Prot5 and Phos5, respectively, were selected as protein signatures (S6-P) (bottom and top right). After mapping the phosphopeptides to phosphorylated proteins and combining them with the 1022 proteins (prot5), the resulting proteins were used for GSEA. b, UMAP plot showing 9 clusters of the cell types indicated. Color coding was used to distinguish clusters. c, Genes specifically up-regulated in the 9 cell clusters. The heat map shows increased (yellow) and decreased (purple) expression (Z-score) of the genes up-regulated (row) in at least one cluster with respect to the median expression across all cells (column). Z-score was obtained from an inverse normal distribution of mRNA expression level. The color bar represents the gradient of the normalized Z-score. d, Violin plots showing distributions of mRNA expression levels for the selected representative genes across the 9 cell clusters. e, Distributions of tumor cellularity measurements determined by the methods indicated for TS1–4 and IS1–2 (n = 37, 18, 14, 33, 26, and 22 tumors for TS1-4 and IS1-2, respectively, in qpure cellularity, KRAS VAF, and ESTIMATE cellularity; n = 33, 15, 13, 30, 24, and 19 tumors for TS1-4 and IS1-2, respectively, in Histological cellularity). In violin plots, the line indicates the median value.
Extended Data Fig. 7 Immunohistochemistry (IHC) analysis of markers for cell types and pathways in network models.
a-f, Representative IHC images (top), quantification of protein expression by IHC staining (bottom left), and protein expression levels from global proteome data (bottom right) for AGR2 (a), ITGA2 (b), FAP (c), FZD1 (d), PCNA (e), and MCM4 (f) (n = 12, 10, 10, and 13 samples for TS1-4, respectively). H-scores were used for quantification, except that the percentage of positive cells in each tumor was used for PCNA and the fraction of positive samples in each TS was used for MCM4. Scale bars = 100 μm. AGR2, P = 3.3 × 10^−7 for TS1 vs. other TSs; ITGA2, P = 3.8 × 10^−2 for TS2-4 vs. TS1; FAP, P = 2.6 × 10^−3 for TS2 vs. other TSs; FZD1 (tumor), P = 8.5 × 10^−3 for TS2 vs. other TSs; FZD1 (stroma), P = 1.4 × 10^−2 for TS3 vs. other TSs; PCNA, P = 3.2 × 10^−6 for TS4 vs. other TSs; and MCM4, P = 2.9 × 10^−3 for TS4 vs. other TSs. In violin plots, the line indicates the median value. *, P < 0.05; **, P < 0.01; ***, P < 0.001 by one-sided Student’s t-test (a-e) and Fisher’s exact test (f).
Extended Data Fig. 8 PDAC subtypes are associated with distinct cellular networks.
a, A network model showing interactions between the genes and proteins involved in immune-related pathways (top) and pancreatic secretion (bottom) associated with IS1–2. Node colors (center and boundary) indicate whether the corresponding gene and protein were selected as signatures for IS1–2 (green for IS1 and dark green for IS2). A circled P on a node indicates phosphorylated peptides that define the corresponding subtypes. Nodes with blue labels indicate kinases predominantly activated in IS1-2. Arrows, activation; inhibition symbols, inhibition; solid arrows, direct activation; dotted arrows, indirect activation; gray lines, protein-protein interactions. b, Heat map showing activities of the indicated kinases for the samples grouped according to the subtypes identified (n = 37, 18, 14, 33, 26, and 22 samples for TS1-4 and IS1-2, respectively). Color bar, gradient of normalized enrichment score (NES_ij) from GSEA, which indicates the activity of kinase i in sample j. Subtypes associated with predominant kinase activation are indicated in parentheses. c, Subtype distributions of the samples that harbored somatic mutations of TP53 or ARID1A and had altered protein abundance or phosphorylation level of the corresponding protein-encoding gene (n = 150 samples). The colors in the heat maps below the bar plots indicate the subtypes of patients. d, The number of patients with or without mutations for which protein abundances or intensities of phosphorylated peptides are higher (positive) or lower (negative) than the median across patients. The colors in the stacked bar graphs indicate the subtypes of patients.
Extended Data Fig. 9 PMN-MDSCs inhibit T cell proliferation.
a, mRNA/protein levels of immune cell markers across TS1-4 and IS1-2. The heat maps show mRNA (left) and protein (right) Z-scores of the markers in TS1-4 and IS1-2. In each subtype, for each marker, Z-scores were computed by auto-scaling the median level in the subtype using the median and standard deviation of levels across TS1-4 and IS1-2. b, Distributions of the marker levels for T cells (CD4 and CD8A) and neutrophils (CXCL1, CXCL8, and LCN2) in TS1-4 and IS1-2 (n = 37, 18, 14, 33, 26, and 22 tumors for TS1-4 and IS1-2, respectively). In violin plots, the center line indicates the median level of the markers. c, Schematic procedure for development of orthotopic PDAC models. d, Representative ultrasound images of SNU3608 and SNU3573 tumors at Day 8, 22, and 36. Dotted circles represent the tumors. e, Gross images of SNU3608 and SNU3573 tumors at Day 42. f, SNU3608 and SNU3573 tumor weights at Day 42 (n = 10 samples/group). P = 5.5 × 10^−2 by two-sided Student’s t-test. g, FACS gating scheme for myeloid population and chemokine receptors. Contour lines, cell density distributions; solid lines, indicated cell populations; red arrowheads, FACS gating flow. h-j, Percentages of indicated immune cells in SNU3608 and SNU3573 tumors in blood (CD11b: Naïve vs. SNU3608, P = 1.8 × 10^−3; Naïve vs. SNU3573, P = 6.0 × 10^−3; PMN-MDSC: Naïve vs. SNU3608, P = 2.2 × 10^−3; Naïve vs. SNU3573, P = 7.5 × 10^−3; Neutrophils: Naïve vs. SNU3608, P = 8.6 × 10^−3; Naïve vs. SNU3573, P = 1.6 × 10^−2) (h), spleen (i), and bone marrow (BM, j). PMN- and M-MDSC, polymorphonuclear and monocytic myeloid-derived suppressor cells, respectively. k-n, Numbers or percentages of indicated PMN-MDSC groups in SNU3608 and SNU3573 tumors (P < 0.0001) (k), as well as in blood (Naïve vs. SNU3608, P < 0.001; Naïve vs. SNU3573, P = 2.0 × 10^−4) (l), spleen (m), and BM (P = 9.0 × 10^−4) (n) of Balb/c-nu mice carrying SNU3608 and SNU3573 tumors. Naïve, n = 3 samples of blood, n = 4 samples of spleen and BM; SNU3608, n = 8 samples; SNU3573, n = 10 samples (h-n). *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001 by two-way ANOVA with Sidak’s post hoc corrections (h-n). All data are shown as mean ± SEM (f and h-n).
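The auto-scaling described for panel a can be read as the following minimal Python sketch. The interpretation is an assumption: for one marker, the median level within each subtype is centered on the median of all samples across the subtypes and divided by their standard deviation (the sample standard deviation is used here). The function name and the values are hypothetical.

```python
from statistics import median, stdev


def subtype_z_scores(levels_by_subtype):
    """Auto-scale each subtype's median level for a single marker.

    `levels_by_subtype` maps a subtype name to the marker levels of its samples.
    """
    all_levels = [v for values in levels_by_subtype.values() for v in values]
    center = median(all_levels)   # median across all subtypes
    spread = stdev(all_levels)    # sample standard deviation (assumption)
    return {
        subtype: (median(values) - center) / spread
        for subtype, values in levels_by_subtype.items()
    }


# Hypothetical marker levels for two subtypes (not data from the study).
levels = {"TS1": [1.0, 2.0, 3.0], "TS4": [7.0, 8.0, 9.0]}
z = subtype_z_scores(levels)
```

A positive Z-score then marks a subtype whose typical marker level lies above the cohort-wide median, and a negative one a subtype below it.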
Extended Data Fig. 10 Higher level of infiltration of PMN-MDSCs is observed in TS4 tumors than in IS2 tumors.
a, Comparison of tumor volumes over time measured by ultrasonography in orthotopic PDAC models grafted with cells derived from TS4 (SNU3608) and IS2 (SNU3573 and SNU3752) tissues (SNU3608-2 vs. SNU3573-2, P = 0.0001; SNU3608 vs. SNU3752-2, P < 0.0001). b-c, Comparison of volume (SNU3608-2 vs. SNU3573-2, P = 1.0 × 10^−3; SNU3608-2 vs. SNU3752-2, P < 0.001) (b) and weights (P < 0.001) (c) of SNU3608, SNU3573, and SNU3752 tumors at the end point. d, Gross photos of SNU3608 (TS4), SNU3573 (IS2) and SNU3752 (IS2) tumors taken on Day 42. e-f, Percentages of immune cell subsets infiltrating SNU3608, SNU3573, and SNU3752 tumors (****, P < 0.0001; PMN-MDSC: SNU3573-2 vs. SNU3752-2, P = 3.2 × 10^−3; Neutrophil: SNU3573-2 vs. SNU3752-2, P = 3.8 × 10^−3) (e) and blood (****, P < 0.0001; CD11b: SNU3608-2 vs. SNU3573-2, P = 1.2 × 10^−3; SNU3573-2 vs. SNU3752-2, P = 4.0 × 10^−4; PMN-MDSC: SNU3573-2 vs. SNU3752-2, P = 8.0 × 10^−4; Neutrophil: SNU3573-2 vs. SNU3752-2, P = 2.4 × 10^−2) (f). g-h, Percentages of the four indicated groups of PMN-MDSCs defined based on expression levels of CXCR2 and CXCR4 measured in SNU3608, SNU3573 and SNU3752 tumors (****, P < 0.0001; CXCR2−CXCR4−: SNU3608-2 vs. SNU3573-2, P = 1.7 × 10^−2; SNU3573-2 vs. SNU3752-2, P = 3.6 × 10^−3) (g) and blood (P < 0.0001) (h). n = 9, 5, and 9 samples for SNU3608, SNU3573, and SNU3752, respectively (a-h). *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001 by two-way analysis of variance (ANOVA) with Sidak’s post hoc corrections (a and e-h) and one-way ANOVA with Dunnett’s post hoc correction (b-c). All data are shown as mean ± SEM (a-c and e-h).
Supplementary information
Supplementary Table 1–7
Legends of supplementary tables are included in the first sheet of Supplementary_Tables.
Source data
Source Data Figs. 1–8
Statistical source data (Figs. 1–8) and unprocessed western blots (Fig. 3).
Source Data Extended Data Figs. 1–10
Statistical source data (Extended Data Figs. 1–10) and unprocessed western blots (Extended Data Fig. 4).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hyeon, D.Y., Nam, D., Han, Y. et al. Proteogenomic landscape of human pancreatic ductal adenocarcinoma in an Asian population reveals tumor cell-enriched and immune-rich subtypes. Nat Cancer 4, 290–307 (2023). https://doi.org/10.1038/s43018-022-00479-7
DOI: https://doi.org/10.1038/s43018-022-00479-7