Spatially restricted drivers and transitional cell populations cooperate with the microenvironment in untreated and chemo-resistant pancreatic cancer

Pancreatic ductal adenocarcinoma is a lethal disease with limited treatment options and poor survival. We studied 83 spatial samples from 31 patients (11 treatment-naïve and 20 treated) using single-cell/nucleus RNA sequencing, bulk-proteogenomics, spatial transcriptomics and cellular imaging. Subpopulations of tumor cells exhibited signatures of proliferation, KRAS signaling, cell stress and epithelial-to-mesenchymal transition. Mapping mutations and copy number events distinguished tumor populations from normal and transitional cells, including acinar-to-ductal metaplasia and pancreatic intraepithelial neoplasia. Pathology-assisted deconvolution of spatial transcriptomic data identified tumor and transitional subpopulations with distinct histological features. We showed coordinated expression of TIGIT in exhausted and regulatory T cells and Nectin in tumor cells. Chemo-resistant samples contain a threefold enrichment of inflammatory cancer-associated fibroblasts that upregulate metallothioneins. Our study reveals a deeper understanding of the intricate substructure of pancreatic ductal adenocarcinoma tumors that could help improve therapy for patients with this disease.

HT125P1, two samples (S1H3 and S1H9) do not have a detectable KRAS mutation while the other two (S1H4 and S1H8) have a G12V mutation, and these samples segregate accordingly into the higher and lower phospho-signaling groups, suggestive of differential RAS activation within the same patient. In particular, the MAPK1 T185 phosphosite is differentially regulated between these samples, while MAPK1 Y187 is uniformly expressed throughout HT125P1.
Furthermore, the three cases enclosed in colored boxes, which match the heterogeneous cases in Extended Data Fig. 2h, show consistent heterogeneous protein signaling in the samples with different cell type composition, pointing to the importance of spatial sampling to obtain the full picture of heterogeneity in PDAC (Extended Data Fig. 2h,i).

Gene Expression and CNV Model for HT061P1
The tumor subpopulation in punch A is initially driven by KRAS G12V and the KRAS G12V population, then acquired an amplification in GATA6 and a 17p deletion. In punches B and C (and a minute portion of A), the initial driver was KRAS G12D, followed by a gain of AKT2, MYC, and 1q, with additional subsets of cells acquiring either ERBB2 or GATA6 amplification and PTEN deletion (Fig. 3d,e and Extended Data Fig. 2f). Given the extent of heterogeneity observed across all samples, these results suggest a vast degree of tumor heterogeneity in PDAC generally at the expression, mutational, and CNV levels that may impact tumor growth and progression (Extended Data Fig. 2g).

CAF TME-Remodeling Pathways
To better understand the role of CAFs in tumorigenesis, we analyzed the expression patterns of CAF subtypes to test whether they were enriched for TME-remodeling pathways. Aside from myCAFs, we observed a significant reduction of CAV1 and CAV2 expression, previously linked to poor clinical outcomes 1,2 , in all CAFs compared to fibroblasts present in NAT samples (Extended Data Fig. 7c). In addition to their very high levels of CXCR4 expression, CXCR4+ iCAFs express high levels of its ligand, CXCL12 (Extended Data Fig. 7d) [3][4][5] . While apCAFs have high expression of NFE2L2, which is involved in oxidative damage repair, myCAFs have high expression of HIF1A, which regulates tolerance to hypoxic environments (Extended Data Fig.   7e). Together, our results suggest different roles of CAF subtypes in remodeling of the TME 6,7 .

Lymphocyte Subsets Breakdown
Within the lymphocyte subsets, we assigned states to CD4+ and CD8+ T based on their exhaustion, proliferation, and cytotoxic markers (Extended Data Fig. 8d,e). We observed similar percentages of cell types across treatment groups, with slightly higher abundance of CD4+ T cells in FOLFIRINOX samples and higher numbers of CD8+ T cells in treated samples (Extended Data   Fig. 8f). CD4+ T cells and Tregs in FOLFIRINOX samples had high expression of heat shock genes, such as HSPA1A, HSPA1B, HSPH1, and HSPD1, compared to other treatment groups (Extended Data Fig. 8g). Further, pathway enrichment analyses revealed a large number of cellular responses to heat stress in both CD4+ T cells and Tregs in these samples (Extended Data Fig. 8h).

Pathological Parameters and Assessment
Each tumor that is subdivided into smaller increments is subjected to H&E staining and assessed by a pathologist for the following parameters: percentage of viable tumor present, tumor differentiation, presence of recognizable pancreatic parenchyma surrounding or interspersed between tumor, lymphovascular invasion, and perineural invasion. Both slices of each tumor piece, both L1 and L4 when available, were subjected to assessment. For the correlation with scRNA-based tumor percentages, we averaged the top and bottom slide (L1 and L4) tumor estimates.

Protein Extraction and Lys-C/Trypsin Tandem Digestion
Tissue lysis and downstream sample preparation for global proteomic and phosphoproteomic analysis were carried out as previously described 9,10 . Approximately 25-50 mg of each cryopulverized HTAN tissue was resuspended in lysis buffer (8 M urea, 75 mM NaCl, 50 mM Tris, pH 8.0, 1 mM EDTA, 2 µg/mL aprotinin, 10 µg/mL leupeptin, 1 mM PMSF, 10 mM NaF, Phosphatase Inhibitor Cocktail 2 and Phosphatase Inhibitor Cocktail 3 [1:100 dilution], and 20 µM PUGNAc) by repeated vortexing. Lysates were clarified by centrifugation at 20,000 g for 10 min at 4°C, and protein concentrations were determined by BCA assay (Pierce). Proteins were reduced with 5 mM dithiothreitol (DTT, ThermoFisher) for 1 h at 37°C, and subsequently alkylated with 10 mM iodoacetamide (Sigma) for 45 min at room temperature (RT) in the dark. Samples were diluted 1:4 with 50 mM Tris-HCl (pH 8.0) and subjected to proteolytic digestion with LysC (Wako Chemicals) at 1 mAU:50 mg enzyme-to-substrate ratio for 2 h at RT, followed by the addition of sequencing grade modified trypsin (Promega) at 1:50 enzyme-to-substrate ratio and overnight incubation at RT. The digested samples were then acidified with 50% formic acid (FA, Fisher Chemicals) to pH 2. Tryptic peptides were desalted on reversed phase C18 SPE columns (Waters) and dried using Speed-Vac (Thermo Scientific).

TMT-11 Labeling of Peptides
Dried peptides from each sample were labeled with 11-plex TMT (Tandem Mass Tag) reagents (Thermo Fisher Scientific). 200 µg of peptides from each of the HTAN samples was dissolved in 80 µL of 100 mM HEPES, pH 8.5 solution. 30 HTAN samples were labeled in 3 TMT sets. A reference sample was created by pooling an aliquot from 26 HTAN samples (representing ~90% of the sample cohort) and was included in all TMT 11-plex sets as a pooled reference channel (Channel 126). 5 mg of TMT reagent was dissolved in 500 µL of anhydrous acetonitrile, and then 30 µL of each TMT reagent was added to the corresponding aliquot of peptides. After 1 h incubation at RT, the reaction was quenched by incubation with 5% NH2OH for 15 min at RT.
Following labeling, peptides were desalted on reversed phase C18 SPE columns (Waters) and dried using Speed-Vac (Thermo Scientific).

Peptide Fractionation by Basic Reversed-phase Liquid Chromatography (bRPLC)
To reduce the likelihood of peptides co-isolating and co-fragmenting due to high sample complexity, we employed extensive, high-resolution fractionation via basic reversed phase liquid chromatography (bRPLC). For each TMT set, about 2.2 mg of desalted peptides was reconstituted in 900 µL of 5 mM ammonium formate (pH 10) and 2% acetonitrile (ACN) and loaded onto a 4.6 mm x 250 mm RP Zorbax 300 A Extend-C18 column with 3.5-mm size beads (Agilent).
Collected fractions were concatenated into 24 fractions as described previously 10 ; 5% of each of the 24 fractions was aliquoted for global proteomic analysis, dried down in a Speed-Vac, and resuspended in 3% ACN, 0.1% formic acid prior to ESI-LC-MS/MS analysis. The remaining sample was utilized for phosphopeptide enrichment.

Enrichment of Phosphopeptides by Fe-IMAC
The remaining 95% of the fractions were further concatenated into 12 fractions prior to phosphopeptide enrichment using immobilized metal affinity chromatography (IMAC) as previously described 10 . In brief, Ni-NTA agarose beads were utilized to prepare Fe 3+ -NTA agarose beads, and then about 200 µg of peptides of each fraction reconstituted in 80% ACN/0.1% trifluoroacetic acid were incubated with 10 µL of the Fe 3+ -IMAC beads for 30 mins. Samples were then spun down, and the supernatant containing unbound peptides was removed. The beads were brought up in 80% ACN, 0.1% trifluoroacetic acid and then loaded onto equilibrated C-18 Stage Tips, and washed by 80% ACN, 0.1% trifluoroacetic acid, rinsed twice with 1% formic acid, followed by sample elution off the Fe 3+ -IMAC beads with 100 µL of 500 mM dibasic potassium phosphate, pH 7.0. C-18 Stage Tips were then washed twice with 1% formic acid, followed by elution of the phosphopeptides from the C-18 Stage Tips with 80 µl of 50% ACN, 0.1% formic acid twice. Samples were dried down and resuspended in 3% ACN, 0.1% formic acid prior to ESI-

ESI-LC-MS/MS for Global Proteome and Phosphoproteome Analysis
The global proteome and phosphoproteome fractions were analyzed as described in a previous study 9 . Peptides (~0.8 µg) were separated on an Easy nLC 1200 UHPLC system (Thermo

CiVIC Drug Matching
We obtained evidence of expression-based response to drugs from CIViC 11 . We filtered the database for only sensitive results and positive direction (i.e., "expression" and "overexpression").
We then matched these annotations to upregulated DEGs in our comparison groups. In the present study we used the 06/02/2020 nightly clinical evidence summary annotations.

Copy Number
The somatic copy number alterations (CNAs) are predicted using CNVkit (v0.9.6) 12 . For paired tumor-normal samples, the matched normal was used as the reference to determine CNAs of the tumor sample. Low quality CNAs were filtered based on the coverage (<20), the number of probes (<10), and length (< 10 kb). To define copy numbers from CNVkit, the thresholds are as follows: -t -1.1, -0.2, 0.2, 0.7. Due to low purity of the sample set, pathologist purity estimates were provided (-U) for copy number calling along with the clonal filter (-m).

ESTIMATE Immune and Stroma Scores
Scores reflecting the overall immune and stromal infiltration and tumor purity estimation were calculated by the R package ESTIMATE 14 using the normalized RNA expression data (FPKM-UQ). The ESTIMATE algorithm is based on single-sample gene set enrichment analysis and generates three scores: 1) stromal score (which captures the presence of stroma in tumor tissue), 2) immune score (which represents the infiltration of immune cells in tumor tissue), and 3) estimate score (which infers tumor purity).

Immunofluorescence Staining of FFPE Slides
FFPE samples were first cut into 4-μm tissue sections and heated at 55°C for 5 hours prior to staining. Then, deparaffinization and rehydration was performed via incubation subsequently in xylene, 100%, 95%, 70%, 50%, and 25% ethanol. After another 2-minute wash in ddH2O, antigen retrieval was performed in a hot-water bath using a 10mM sodium citrate buffer at 80-90°C for 22 minutes. The slides were then cooled down to room temperature. After that, glycine blocking was then performed using 100mM glycine, followed by two washes in PBST. Sections were then circled with a PAP pen, and blocked in the blocking buffer (10% normal donkey serum in 1% BSA) for one hour. Next, primary antibodies incubated on the tissues at 4°C overnight. The next day,

Immune Clustering Using Cell Type Enrichment Scores
The abundance of each cell type was inferred by the xCell web tool 16 , which performed the cell type enrichment analysis from gene expression data for 64 immune and stromal cell types (default xCell signature). xCell is a gene signatures-based method learned from thousands of pure cell types from various sources. We used the FPKM-UQ expression matrix as the input to xCell. xCell generated an immune score per sample that integrates the enrichment scores of various cell types, including B cells, CD4+ T-cells, CD8+ T-cells, DC, eosinophils, macrophages, monocytes, mast cells, neutrophils, and NK cells. Immune subtypes of HTAN PDAC cohorts were generated based on the consensus clustering of the xCell cell type enrichment scores (Wilkerson and Hayes, 2010). Among the 64 cell types tested in xCell, we selected cell types with at least 2 samples with xCell enrichment p < 0.01 and performed the consensus immune clustering based on the z-score normalized xCell enrichment scores. The consensus clustering was determined by the R package ConsensusClusterPlus (parameters: reps = 2000, pItem = 0.9, pFeature = 0.9, clusterAlg = "kmdist", distance = "spearman").

Mutation Impact on the Proteome and Phosphoproteome
We used an aggregated database of interacting proteins that combines Omnipath, DEPOD, CORUM, Signor2, and Reactome databases as previously described 17 . We focused our analyses on PDAC SMGs previously reported in the literature, but only KRAS and TP53 had large enough numbers in each comparison group for sufficient statistical power 18 . For each interacting pair, we split samples with and without mutations in partner A and compared expression levels (both protein and phosphosites) both in cis (partner A) and in trans (partner B). We calculated the median difference in expression and tested for significance using two-sided Wilcoxon rank sum tests. We further refined the list of trans interactions by filtering proteins that are not part of oncogenic processes identified in TCGA 19 .

KRAS Phosphosignaling Analysis
Oncogenic KRAS signaling in PDAC is believed to pass through three major pathways: Raf/Mek/Erk, PI3K/Pdk1/Akt, and the Ral guanine nucleotide exchange factor pathway 20 . We focused on the Raf/Mek/Erk pathway (along with PI3K/Pdk1/Akt) because its signaling is controlled by phosphorylation. We used the phosphosites identified in this pathway as detailed in Collisson et al 21 .

Differential Proteogenomic Analysis
For differential analysis between groups of samples using bulk data (gene expression, proteomics, and phosphoproteomics), we used two-sided Wilcoxon rank-sum tests to test for differential abundances of genes, proteins, and phosphosites. We required that at least 50% of samples in each comparison group have non-missing values. P-values were then adjusted using the Benjamini-Hochberg multiple test correction to obtain features with an FDR cutoff ≥ 0.05.

H&E staining
FFPE samples were cut into 4-μm tissue sections and heated at 55°C prior to staining. Then, deparaffinization and rehydration was performed via incubation subsequently in xylene, 100%, 95%, 70%, 50%, and 25% ethanol. Then, slides were stained with hematoxylin for 6 minutes followed by washing with tap water for three times and bluing in Scott's Tap Water gently for 30 seconds. After that, tissues were counterstained with eosin for 30 seconds followed by washing with tap water. Then, the stained tissues were dried overnight at room temperature. The next day, sections were dehydrated with 100% ethanol and xylene. Finally, slides were sealed with nailpolish and images were taken using Leica DMi8 microscope .