Topographic mapping of the glioblastoma proteome reveals a triple-axis model of intra-tumoral heterogeneity

Glioblastoma is an aggressive form of brain cancer with well-established patterns of intra-tumoral heterogeneity implicated in treatment resistance and progression. While regional and single cell transcriptomic variations of glioblastoma have been recently resolved, downstream phenotype-level proteomic programs have yet to be assigned across glioblastoma’s hallmark histomorphologic niches. Here, we leverage mass spectrometry to spatially align abundance levels of 4,794 proteins to distinct histologic patterns across 20 patients and propose diverse molecular programs operational within these regional tumor compartments. Using machine learning, we overlay concordant transcriptional information, and define two distinct proteogenomic programs, MYC- and KRAS-axis hereon, that cooperate with hypoxia to produce a tri-dimensional model of intra-tumoral heterogeneity. Moreover, we highlight differential drug sensitivities and relative chemoresistance in glioblastoma cell lines with enhanced KRAS programs. Importantly, these pharmacological differences are less pronounced in transcriptional glioblastoma subgroups suggesting that this model may provide insights for targeting heterogeneity and overcoming therapy resistance.

Major comments: 1. Due to the limited sample size, binary classification of glioblastoma neoplastic cells into KRAS-or MYC-activated class is unclear. IDH wildtype GBMs are typically classified into three (or four) transcriptomic subtypes, including proneural, classical, and mesenchymal subtypes. The IVYGAP paper (Puchalski et al. Science 2018) claimed via inference that CT and PAN contributes the majority of the bulk RNA-seq outcomes (Fig S8 of IVY GAP paper). This suggest that, if there is sufficient concordance between mRNA and protein levels, the authors might expect at least three (proneural-like, classicallike, mesenchymal-like) proteomic subtypes. Why did the authors's main signatures not include the classical-like subtype? Is this just a sample size issue (number of CT and PAN = 34), a lack of mRNA and protein correlation for genes determining the classical subtype, or overfitting of the prior classification? 2. Unless the authors have clear experimental evidence, naming the two subgroups as MYC-or KRAStarget enriched subgroups, could mislead people, since KRAS mutation is not commonly found in GBM. Instead, proneural-like and mesenchymal-like may better represent the two subtypes. 3. The authors assessed the clinical utility of the two subtypes two-fold: estimating differences in prognosis and deriving subtype-specific treatments. They claimed that the KRAS class has worse prognosis based on a niche-specific inference of the TCGA GBM dataset. However, the mesenchymal subtype of GBM has been shown to have worse prognosis than other transcriptome subtypes (Wang et al. Cancer Cell 2017). As their KRAS-target enriched class is associated with GBM mesenchymal features, it is not surprising that they show worse prognosis. Pharmacological analysis of the two major classes is limited to computational analysis of public data originally obtained in 2D culture conditions with physiologically discordant GBM cell line models. The authors should demonstrate at least one novel clinical utility of the classification to be suitable for Nature Communications. 4. Pharmacological profiling of GSC cells against 188 kinase inhibitors is somewhat off the table. It would be more relevant if the proteomic class (KRAS or MYC) for GSC cells is determined first and show whether any of drugs have selective cytotoxicity in the physiologically relevant condition (hypoxia). However, the results indicated that none of the tested drugs had cytotoxicity in hypoxic conditions. 5. Methods how the authors identified the hypoxia axis is not described in detail. Also, the biological and clinical importance of the axis is not sufficiently explored in the manuscript, except for the abovementioned drug screening data with the GSC model. To claim this as a main axis, further evidence might be needed to support its significance. 6. Justification of including the "LEUKO" dataset for the niche-specific gene signature inference should be addressed in more detail. Stating simply "to act as a surrogate for inflammatory infiltrate" in the Methods is insufficient, since some anatomical regions, particularly MVP, might include signals from immune infiltrates, which might affect the accuracy of the ML models.
Minor comments: 1. In Fig. 2C, it is hard to discern point colors. Use more clear color scales. 2. In Fig. 4B, Cluster 1 and Cluster 2 labels should be moved to columns. 3. 3D plots in Fig. 4D and 4E are very confusing. Please use a different display method, such as 2D + color code. 4. Fig. 4D and 4E seem to be switched or mislabeled. 5. In Fig. 5C, "IvyGap" is underlined in red. Fix this. 6. Fig. 5E and 5F seem to be switched or mislabeled. 7. In Fig. 5E and 5F, please describe how the patients were divided into two groups. 8. Enlarge font sizes for labels in Fig. 6D and other figures. 9. In page 7, line #10, "clinical outcomes of glioma patients from TCGA." Is this glioma or GBM? 10. Software versions and database versions are missing in the Methods section. 11. Cannot find files or links for computer codes and R scripts used to generate the results.

Reviewer #3 (Remarks to the Author): Expert in glioblastoma genomics
This study represents an important resource for the community and the authors should be thanked for performing regionally-resolved proteomic analysis in GBM and making the data available to the community. I have minor comments.
-colors in figure 1C are very hard to distinguish; a different color scheme should be used; -figure 3F is overstained; -there are numerous survival analyses throughout the study and it is sometimes unclear what the message is; in the first part of the study, the analysis is focused on defining regionally-resolved protein-expression programs; but then typically only 1 protein is shown/chosen to correlate with survival; did other genes in e.g. PAN (other than AKAP12) also correlate with survival? (i.e. is it the abundance of PAN that correlated or is this specific to AKAP12); a similar question would be relevant for CD276 and MVP; is it MVP-globally or CD276-specific? It also seems that CD276 is quite highly expressed in PAN and it would be important to show the staining pattern in such areas as well? -figure 5F: the survival analysis seems to not replicate so well in CPTAC dataset; would be important to explain discrepancies/again downplay clinical relevance if needed; -the title/chosen nomenclature around KRAS and MYC is a somewhat simplification; other programs are part of those signatures (e.g. KRAS with mesenchymal and MYC with proneural); so one could also chose to use this "more established" GBM nomenclature; a rationale for focusing on KRAS and MYC naming/subprograms should be provided and the survival analyses should be done with (or compared to) the entire programs, not subprograms; -In general, this reviewer feels that showing correlation to survival is ok, but many many markers have been suggested in GBM and tend to end up not being that useful; so a resource-focused paper, rather than too many survival curves would be preferable;

Point by Point response to reviewer comments
We thank reviewers for your efforts, time and valuable feedback during the challenging times with COVID-19 and hope you and your families are keeping safe. We have now had a chance to review your comments and make all the recommended changes including expanding our analytical interruptions of our triple axis model. We also include additional analyses to highlight clinical utility of our findings and carrying out the additional recommended analyses.
Specifically, the revised manuscript addresses, among others, the following major three aspects raised: (i) quantification of immune content (using CIBERSORT) to characterize the immune differences across tissue niches, (ii) addition of single-cell RNA expression that provides further support of our triple axis model, and (iii) justification of the use of the gene signatures of our model (KRAS_TARGETS, MYC_TARGETS and HYPOXIA) instead of the more commonly used classification system by Verhaak (classical, mesenchymal and proneural).
Please find point-by-point responses to your comments and suggestions below. Be believe your insights have substantially improved the quality of our work and we thank you for this constructive feedback.

Reviewer 1:
Overall, this paper provided interesting molecular profile of intra-tumoral heterogeneity by regionally isolated and defined tissue proteomics. This approach is unique and powerful to show the relationship between tumor microenvironment (TME) and tumor progression. Moreover, by using machine learning method, they also demonstrated new topographic map. This map demonstrates that MYC-and KRAS-axis cooperate with hypoxia to produce three-dimensional model for intratumoral heterogeneity and provide clinical relevance for chemoresistance in GBM. Although, their approaches are novel and interesting, they did not provide enough evidence to strongly support their conclusions. Moreover, there is not much progress that provided by proteomics results compared with previous transcriptomic analysis. The submission could be significantly strengthened by considering the following changes:

Comment R1.1: "In this paper, authors collected different anatomical niches by laser capture microscopy (LCM) to excise regions of IT, CT, PAN, and MVP by benchmarking IVY GAP. However, these samples might have different TME proportions and variable tumor purity. How did they normalize these differences among patient samples?"
Response R1.1: Thank you for your interest in our work and your comments for improvements. We used specific selection criteria for each anatomical niche based on hallmark histomorphologic features comparable to the existing IVY GAP GBM Atlas strategy. We now add the criteria used explicitly for clarity (Methods, page 17, Lines 9-19). To ensure this would be a valuable study and resource, we reviewed over one hundred cases and selected only those in which at least 3 regions were well represented and had the classic and immediately recognizable histologic patterns. These were reviewed independently by two board certified neuropathologists and the cases were included only upon their unanimous agreement. To further increase transparency of each patient sample, we now include representative H&E images before and after LCM and present their respective spatial resolution within a PCA plot (Supplemental figures 2-25).
New Text in the Methods (Page 15, Lines 9-19): "TME differences across patient samples were normalized using the same selection criteria as IvyGAP. Leading Edge (LE) is the outermost boundary of the tumor, where the ratio of tumor to normal cells is about 1-3/100, and the laminar architecture of the cortical layers is frequently evident. Infiltrating Tumor (IT) is the intermediate zone between the Leading Edge (LE) and Cellular Tumor (CT), and is frequently marked by perineuronal satellitosis. Cellular Tumor (CT) constitutes the major part of the tumor core, where the ratio of tumor cells to normal cells is between 100/1 to 500/1. Pseudopalisading Cells around Necrosis (PAN) is the narrow boundary of cells arranged like pseudopalisades along the perimeter of necrosis. Microvascular Proliferation (MVP) refers to two or more blood vessels sharing a common vessel wall of endothelial and smooth muscle cells arranged in the shape of a glomerulus or garland of multiple interconnected blood vessels." Comment R1.2: "By proteomic analysis from FFPE samples, authors identified total 4,794 proteins across the entire sample cohort. However, heterogeneity of tissue composition might contribute to loss of common proteins that were identified and quantified entire set of proteomic analysis. How many proteins were commonly identified and quantified entire samples and how did they quantify missing values? Among different anatomical niches, identified number of proteins are extremely variable. Can they compare statistic power between proteomics and transcriptomics?" Response R1.2: We acknowledge that biological heterogeneity in tissue composition is expected to lead to differences in identified and quantified proteins. To address this comment, we now include numbers of commonly identified proteins within each anatomical niche in a We also highlight that the enrichment of CD276 within vasculature of other cancers and note anti-CD276 therapies (NIH) (Please see Page 14 lines 3-15). To identify proteomic immune signatures we perform GSEA to identify immune processes enriched in MVP. We now highlight a number of pathways including allograft rejection and neutrophil activity in the modified figure (Please see Fig. 2e). Comment R1.4: To extract functionally relevant gene sets at the individual sample level, independent of regional groups, they did hierarchical clustering of purified tumor tissue regions (CT and PAN samples). Is there any reason why they exclude other regions? How they control contamination or purity issue among different samples? Response R1.4: Thank you for your comments. For clarification, we chose to focus on only CT and PAN for downstream analysis, as these regions contained the highest proportion of pure tumor cells (nearly 100% by histology). Conversely, given their definition, MVP, IT, and LE samples contain a significant proportion of non-neoplastic cellular elements. By utilizing only CT and PAN samples for downstream analysis, we believe we could sufficiently overcome contamination of the most relevant non-neoplastic cell types (e.g. endothelial cells, normal brain tissue elements). For sample purity, we utilized pathologistdefined regions as defined by the IvyGAP study (Please see Supplementary Figs. 2-25). As mentioned above, we screened a large number of cases to control for niche variations across patient samples and created the presented high-quality cohort for our regional analysis.
New text included (Page 8, Lines 21-24): "For this analysis, we chose to focus on only CT and PAN, as these regions contained the highest proportion of pure tumor cells (nearly 100% by histology) compared to other regions like MVP, IT, and LE that contained a significant proportion of non-neoplastic cellular elements that could confound the analysis."  Fig. 30). We now provide a more detailed explanation behind our rationale for building a GBM disease model based on the expression status of the KRAS and MYC targets instead of using the GBM classification system proposed by Verhaak (See Fig. 4G). This receiver operator curve (ROC) analysis of our samples using either the Verhaak signature sets (0.91 and 0.82 for mesenchymal and proneural, respectively) and our MYC and KRAS signatures (0.95 and 0.88 respectively) reveals better performance of our model in classifying our simples into distinct clusters. Furthermore, we compare p-value rankings of different MsigDB modules for both of the Verhaak gene sets and those of KRAS and MYC demonstrating improved compactness of the molecular modules of our proposed axes ( Supplementary Fig. 31). This latter feature turned out to be quite important when comparing drug sensitivity differences of 31 GBM cell lines later in the paper as much fewer axis-specific agents were identified using the traditional transcriptionally-defined subgroups. These findings are described in the modified manuscript as follows: New text included (Page 9, Lines 14-23): "Of note, the Verhaak_GBM_classical signature did not appear in our set of 64 proteogenomic signatures due to low protein/RNA concordance (robust linear fit's p=0.059). Notably, both this signature and the neural subgroup (which did have good protein-RNA correlation) did not meaningfully correlate with the two defined protein-based clusters. Given the aforementioned proteogenomic discordances in cancer programs and contextual differences (e.g. bulk RNA profiling), we used the area under the receiver operator curve (AUC, ROC) to define the nomenclature of our spatially-profiled and protein clusters (Fig. 4g). Importantly, in our proteomic datasets, MYC and KRAS gene set-driven clustering yields more robust sample separation than Verhaak Mesenchymal and Proneural gene sets (Supplementary Fig. 30)." Comment R1.6: In Fig 4D, and E, authors included third axis as "Hypoxia" molecular signature. However, hypoxia signature seems to be more associated PAN regions than CT regions. Is there any clear reason why hypoxia signature is most important third axis compare to other signatures? Response R1.6: We appreciate this comment regarding the third axis of our model. For clarification, our unsupervised analysis yielded two major clusters/axes (MYC and KRAS-related clusters). The intermixing of PAN and CT however was somewhat surprising to us given the visible histopathologic pattern differences of these regions. As a quality control and additional separation layer, we therefore wanted to see how the hypoxia signature further segregates the two defined clusters. Remarkably, PAN regions could be reliably differentiated from CT regions using this signature (See ROC analysis in Fig. 4g). We therefore felt this was an important third axis to incorporate into our model because it provided a strong account for the known hypoxic biology presumed to exist in PAN regions and highlighted the robustness of our niche selection and analytical pipeline. We now better clarify the motivation for this third axis in the revised manuscript as follows: New text included (Page 9, Lines 5-13). "Closer inspection of signature enrichment patterns revealed an inverse correlation between the KRAS targets and MYC targets signatures (Fig. 4c), and, despite being intermixed between CT and PAN regions, these anatomical coordinates appeared to be relatively further faithfully segregated across a third "hypoxia" molecular signature axis that was also one of the concordant proteogenomic programs defined by our analysis (Fig. 4d). This triple axis of separation was validated by interrogation of the independent IVY gap transcriptional atlas (Fig. 4e), and confirming, within the CT samples, that high KRAS targets activity is associated with invasion and epithelial-tomesenchymal transition processes whereas samples enriched for the MYC axis were associated with cell cycle progression (Fig. 4f)." Response R1.7: We appreciate the comment. We note that both the KRAS and MYC signatures were derived from highly pure tumor regions and not necessarily dependant on specific extrinsic tumor microenvironments. The pharmacoproteomic analysis was used to highlight the potential significance of the model we proposed. This was done by exploring transcriptional patterns of 33 different GBMs in the CCLE resource (Broad Institute) and ranking their expression profiles along the different axes. This allowed us to compare influence of KRAS-and MYC-signatures across multiple GBMs in the same culture condition.

Comment R1.8: Authors identified key signaling axes, KRAS-and MYC-signatures, that can potentially pharmacologically profile GBM patients. Can they also apply this signature to single cell analysis datasets that was already available in other cohorts?
Response R1.8: To addresss your comment, we now explore our proposed axes in single cell data from a recent publication as a complementary validation analysis (Richards et al., Nature Cancer 2021). This data is now included (Fig. 4h-j and Supplementary Fig. 33). Specifically, we show that transcriptional single cell profiles of GBMs show variable expression profiles that can be scored along the KRAS and MYC enrichment axis and that these axes are mutually exclusive to one another. Moreover, the major biological processes we found to be enriched along these 2 axes are also concordant in this single cell analysis, further strengthening our proposed model. Lastly, this new data further supports and validates that the axis signals are driven from tumor cell programs and not contaminated from other non-neoplastic TME components in the regions. This new analysis is now described in the text.
New text included (Page 10, Lines 6-18): "To further characterize intra-tumour variability, the enrichment level of the KRAS_TARGETS and MYC_TARGETS signatures was assessed in a group of samples at the single-cell level by using scRNA data from GBM cells isolated from bulk tumour samples 44 . The GBM cells were grouped as KRAS_TARGETS-high, MYC_TARGETS-high, together with a "Central group" that shows no elevated enrichment in either signature (Fig. 4h and Supplementary Fig. 33). Cells highly enriched for both signatures are nearly non-existent, supporting the observed mutually exclusive feature of these cellular phenotypes. The MYC_TARGETS-high group presented high enrichment of the HALLMARK_MITOTIC_SPINDLE signature (Fig. 4i) whereas KRAS_TARGETS-high cells were enriched in WU_CELL_MIGRATION, a signature that is associated with tumour invasiveness (Fig. 4j). These results support a model where a large proportion of GBM cells within a tumour tissue appear to be "oncogenetically stable", with certain subpopulations exhibiting "oncogenic activation" towards either invasive or proliferative phenotypes."

Minor Comments:
Comment R1.9: In Fig 4D, Fig 5E, and F The main finding of this study is the identification of two distinct neoplastic cell classes, MYC-and KRAS-targets activated classes, wherein hypoxia signature serves as an orthogonal axis with which to further distinguish each class to PAN and CT. Using machine learning regression model-based inference of CT-niche specific signatures by mining public bulk RNA-seq data, they showed patients with an enhanced KRAS-signature exhibit poor prognosis. Likewise, by stratifying GBM cell lines in the public pharmacogenomics dataset into KRAS-and MYC-enriched subgroups, they found that the KRAS-subgroup displays chemoresistance.

and E, description on figures is different from the description in legend and manuscript. In
While the manuscript is interesting and clearly patient relevant, the study novelty is somewhat incremental, and the conclusions are not yet definitive. Below are some suggestions for the authors to consider to further improve their manuscript.

Comment R2.1: Due to the limited sample size, binary classification of glioblastoma neoplastic cells into KRAS-or MYC-activated class is unclear. IDH wildtype GBMs are typically classified into three (or four) transcriptomic subtypes, including proneural, classical, and mesenchymal subtypes. The IVYGAP paper (Puchalski et al. Science 2018) claimed via inference that CT and PAN
contributes the majority of the bulk RNA-seq outcomes (Fig S8 of IVY GAP paper). This suggest that, if there is sufficient concordance between mRNA and protein levels, the authors might expect at least three (proneural-like, classical-like, mesenchymal-like) proteomic subtypes. Why did the authors's main signatures not include the classical-like subtype? Is this just a sample size issue (number of CT and PAN = 34), a lack of mRNA and protein correlation for genes determining the classical subtype, or overfitting of the prior classification?
Response R2.1: Thank you for raising this important point. In short, the classic-like signature was excluded due to a lack of significant mRNA to protein correlation. Our final cohort, while relatively small, was generated from a much larger number of cases after carefully selecting for relatively large GBM resections in which the majority of niches could be objectively identified and sufficiently isolated in a meaningful manner. By using more than 5,000 gene signatures of the MSigDB as the starting point, we carried out a computational process to select those that are informative in GBM at the protein level. This was largely driven by identifying gene sets with strong protein/RNA concordance. This led to a datadriven nomination of 64 signatures, which were then applied to our dataset. From the 4 Verhaak classes, the Proneural, Mesenchymal and Neural classes met all the selection criteria and were part of the set of 64 signatures. As you know, the relevance of the neural subgroup has recently been challenged, and in our analysis, also did not appear to be specific to either of our two defined clusters. The Verhaak classical signature was excluded due to poor protein/RNA concordance (robust linear fit's p=0.059). We now include a statement in the manuscript providing our reasoning for not including the classical signature (around page 9, lines 14-15). For completeness, we also include an ROC analysis for the classical signature to also show it did not meaningfully correlate with the defined proteomic clusters (Fig 4G and  also Fig. 4g and Supplemental Fig. S30), at least at the proteomic level. We further clarified the statistical and contextual motivation for choosing the proposed nomenclature over those found in bulk-based genomic studies in the response to Reviewer 1, Comment R1.5 and new section in the text below.
New text included (Page 9, Lines 14-30): "Of note, the Verhaak_GBM_classical signature did not appear in our set of 64 proteogenomic signatures due to low protein/RNA concordance (robust linear fit's p=0.059). Notably, both this signature and the neural subgroup (which did have good protein-RNA correlation) did not meaningfully correlate with the two defined protein-based clusters. Given the aforementioned proteogenomic discordances in cancer programs and contextual differences (e.g. bulk RNA profiling), we used the area under the receiver operator curve (AUC, ROC) to define the nomenclature of our spatially-profiled and protein clusters (Fig.  4g). Importantly, in our proteomic datasets, MYC and KRAS gene set-driven clustering yields more robust sample separation than Verhaak Mesenchymal and Proneural gene sets (Supplementary Fig. 30)." Comment R2.3: The authors assessed the clinical utility of the two subtypes two-fold: estimating differences in prognosis and deriving subtype-specific treatments. They claimed that the KRAS class has worse prognosis based on a niche-specific inference of the TCGA GBM dataset. However, the mesenchymal subtype of GBM has been shown to have worse prognosis than other transcriptome subtypes (Wang et al. Cancer Cell 2017). As their KRAS-target enriched class is associated with GBM mesenchymal features, it is not surprising that they show worse prognosis. Pharmacological analysis of the two major classes is limited to computational analysis of public data originally obtained in 2D culture conditions with physiologically discordant GBM cell line models. The authors should demonstrate at least one novel clinical utility of the classification to be suitable for Nature Communications.

Response R2.3:
We agree that other gene expression signatures have been proposed to predict disease outcome in GBM (including the mesenchymal signature). We note however the mentioned gene signature is quite large and span multiple gene modules and a large array of genes that make its utility potentially less valuable for clinical implementation and therapeutic applications. The KRAS signature we propose, in addition to better representing our generated proteomic dataset, is also more compact (Supplementary Supplementary Fig. 31). We believe showing a survival difference with a "cleaner" signature can therefore be seen as progress in classification as it may be subject to less overfitting than the mesenchymal signature. We now further discuss important differences between KRAS and mesenchymal signatures and believe they provide insights into defining more refined phenotypic models of how glioblastoma heterogeneity could be conceptualized (See Response to Reviewer 1, Comment R1.5).
Despite the stated limitations of our pharmacological analysis, we feel it provides relevant information of how these axes may be contributing to resistance. As clinical management of cancer progresses towards more specific and precise molecular therapies, having a refined model of inter-and intra-tumoral heterogeneity of GBM at the proteomic level is of biological relevance with future clinical utility. To further demonstrate the clinical utility of these refined signatures using the existing pharmacogenomic signatures, we show that doing the same analysis with these potentially less specific mesenchymal and proneural signatures leads to fewer significant pharmacological differences across these different molecular axes. We believe highlight the superior pharmacological sensitivity predictive power of our molecular programs to be a relevant innovation for the GBM field.

Comment R2.4: Pharmacological profiling of GSC cells against 188 kinase inhibitors is somewhat off the table. It would be more relevant if the proteomic class (KRAS or MYC) for GSC cells is
determined first and show whether any of drugs have selective cytotoxicity in the physiologically relevant condition (hypoxia). However, the results indicated that none of the tested drugs had cytotoxicity in hypoxic conditions.