Identification of immune-related subtypes of colorectal cancer to improve antitumor immunotherapy

Immunotherapy involving immune checkpoint inhibitors (ICIs) for enhancing immune system activation is promising for tumor management. However, the patients’ responses to ICIs are different. Here, we applied a non-negative matrix factorization algorithm to establish a robust immune molecular classification system for colorectal cancer (CRC). We obtained data of 1503 CRC patients (training cohort: 488 from The Cancer Genome Atlas; validation cohort: 1015 from the Gene Expression Omnibus). In the training cohort, 42.8% of patients who exhibited significantly higher immunocyte infiltration and enrichment of immune response-associated signatures were subdivided into immune classes. Within the immune class, 53.1% of patients were associated with a worse overall prognosis and belonged to the immune-suppressed subclass, characterized by the activation of stroma-related signatures, genes, immune-suppressive cells, and signaling. The remaining immune class patients belonged to the immune-activated subclass, which was associated with a better prognosis and response to anti-PD-1 therapy. Immune-related subtypes were associated with different copy number alterations, tumor-infiltrating lymphocyte enrichment, PD-1/PD-L1 expression, mutation landscape, and cancer stemness. These results were validated in patients with microsatellite instable CRC. We described a novel immune-related class of CRC, which may be used for selecting candidate patients with CRC for immunotherapy and tailoring optimal immunotherapeutic treatment.


Methods
Patient selection. A total of 1503 patients with CRC, whose gene expression profiles and clinicopathological characteristics were available, were selected; the relevant data were obtained from public databases. The Cancer Genome Atlas-Colon Cancer (TCGA-COAD) and The Cancer Genome Atlas-Rectal Cancer (TCGA-READ) datasets were downloaded from UCSC XENA (http:// xena. ucsc. edu/) and used as the training cohort. Only patients diagnosed with adenocarcinoma (n = 488) were selected for the subsequent analysis. Three additional CRC datasets with the required clinical information were obtained from the Gene Expression Omnibus (GEO) database (http:// www. ncbi. nlm. nih. gov/ geo/) and used as an external validation cohort. The three cohorts-GSE39582, GSE14333, and GSE17538-comprised 1015 patients.

Extraction of immune expression pattern.
We used the NMF package (v0. 23.0) to perform microdissection of the mRNA expression profiles in the TCGA-COAD and TCGA-READ datasets. After testing, it was found that K = 10 was the number of factors that could be applied for good segmentation of TCGA data. To extract data on immune-related NMF factors, ESTIMATE was used to analyze the immune enrichment score in TCGA samples. Statistical analysis revealed that among all NMF factor groups, patients whose data were decomposed into the seventh factor had a higher immune enrichment score; hence, the seventh factor was defined as an "immune factor. " According to the difference between the load value of the seventh factor and the maximum load of other factors, all genes were sorted from high to low, and the genes with the top 150 weights were considered to be the key factors distinguishing immune and non-immune classes. This classification was achieved through the utilization of the NMF Consensus command of the NMF package with top 150 genes, which was further refined using the multidimensional scaling (MDS) random forest method of the randomForest (v4. [6][7][8][9][10][11][12][13][14][15][16] package. The clusterProfiler (v3.14.3) package was used to analyze the function of the 150 immune-related genes.
Further classification of immune classes. Nearest template prediction (NTP) (CMScaller_0.99.2 package) and the single-sample Gene Set Enrichment Analysis (ssGSEA) method in the Gene Set Variation Analysis (GSVA) (v1. 34.0) package were used to evaluate gene expression signatures and enrichment of molecular pathways that represented different inflammatory states or immune cells. Through NTP of stroma activation, the immune class data were further bifurcated into immune-suppressed and immune-activated subtypes. DESeq2 software was used to select genes that were significantly differentially expressed between the immune and nonimmune groups with p adj < 0.05 and absolute value of log2 fold change (FC) > 1. To determine the gene sets and pathways enriched in the immune and non-immune groups, the clusterProfiler (v3.14.3) package was used to conduct Kyoto Encyclopedia of Genes and Genomes and Gene Ontology pathway functional enrichment analysis of the differentially expressed genes (DEGs); simultaneously, all genes were enriched through GSEA using the fgsea (v1.12.0) package.
Validation of immune subtypes in the validation cohort. We screened the top 150 dysregulated genes between the immune and non-immune classes. Similar to the training cohort, the NMF and the ESITI-MAT algorithms were adopted to separate the validation cohort samples into immune and non-immune classes based on the expression of the selected 150 genes. Furthermore, NTP was used to further divide the immune group into immune-suppressed and immune-activated subtypes.
Prediction of response to immunotherapy. To predict the responses to ICI therapy in patients with CRC, tumor immune dysfunction and exclusion (TIDE) and SubMap analysis (GenePattern module "SubMap") were used. SubMap is an algorithm used to assess similarities in gene expression between previously defined Correlation of immune class with copy number alterations, mRNA stemness index, neoantigens, TILs, TMB, and mutational genes. Copy number alterations (CNAs) were calculated using GIS-TIC2.0 from GDAC Firehose (https:// gdac. broad insti tute. org). A statistical comparison was then conducted to determine the difference between the immune and non-immune groups in amplification or deletion events at the arm and focal levels. Previously, Rooney et al. calculated the neoantigen number of TCGA tumors, from which the neoantigen number of TCGA-COAD and TCGA-READ could be obtained 34 . Furthermore, Saltz et al. estimated the abundance of TILs based on the images of hematoxylin and eosin (H&E)-stained sections of 13 TCGA tumor types, which included data on CRC 35 . Mutation data were collected from TCGA (https:// tcga-data. nci. nih. gov), and the TMB was evaluated using the maftools (v2.6.05) package. To obtain data on the significantly mutated cancer genes (p < 0.01), we used the MutSigCV (v1.41) package to analyze the mutation data. Independent tests were then performed to further determine the mutations with significant differences among the different groups (p < 0.05). Finally, the mutation landscape oncoprint was generated using maftools. The mRNA stemness index (mRNAsi) was analyzed through one-class logistic regression, which can help predict stemness in poorly differentiated tumors.
Statistical analysis. Unless otherwise noted, all analyses were conducted using R software (v3.6.1), and the significance level was set at p < 0.05. Normally distributed data were analyzed using the Student's t-test and analysis of variance, whereas non-normally distributed data were analyzed using the Wilcoxon and Kruskal-Wallis tests. For the analysis of categorical variables, the Fisher's exact and Pearson's chi-square tests were used (ns p > 0.05, *p < 0.05, **p < 0.01, ***p < 0.001, or ****p < 0.0001). Kaplan-Meier analysis and the log-rank test were conducted to compare patients' survival between different immunophenotypes.

Results
Identification of a novel immune-related subtype of CRC . To establish an immune-associated molecular classification of CRC, we selected 488 patients from TCGA for a training cohort and 1015 patients from the GEO for a validation cohort, to dissect gene expression profiles using the NMF algorithm (Fig. 1). The omics sequencing data and complete clinical information of the patients included in this study are presented in Table 1. We first used the NMF algorithm to conduct a virtual microdissection of distinct gene expression patterns in the training cohort. Upon integration with the immune enrichment score 36 generated using the ESTI-MATE algorithm, the seventh of the 10 expression patterns (NMF factors) was found to be enriched in patients with high immune enrichment scores and exhibited a significantly higher immune enrichment score average than most of the remaining patterns ( Fig. 2a and Supplementary Fig. S1). Therefore, this pattern was considered as an immune factor. The top 150 weighted genes in the immune factor were defined as exemplar genes, which represented the immune factor expression pattern. We performed Gene Ontology enrichment analysis for the exemplar genes and observed that immune activation-associated pathways, such as humoral immune response, lymphocyte-mediated immunity, complement activation, and antigen binding, were significantly enriched (Supplementary Fig. S1), indicating the immune-related functions and signaling of the immune factor.
To further stratify immune-related CRC subtypes, consensus clustering based on the exemplar genes was conducted to obtain preliminary classes, and data for the classes were refined using an MDS random forest algorithm (Fig. 2b). This helped identify a novel immune molecular subtype that accounted for 42.8% of the cohort (209/488). This subtype was termed as the "immune class. " The remaining 57.2% (279/488) of the cohort was termed as the "non-immune class" (Fig. 2c). We characterized the function of the immune class using immune-related gene signatures derived from literature and databases (Supplementary Table S1) through GSVA analysis. The immune class showed significantly higher enrichment scores for immune-related signatures than the non-immune class, including immune infiltrating cell-related signatures (T cell, NK cell, B cell, macrophage), as well as cytotoxic effect-related signatures, such as tertiary lymphoid structure (TLS), interferon (IFN) signatures, and the cytolytic activity score (CYT) (all p < 0.05, Fig. 2d). Additionally, we compared the DEGs between the immune and non-immune classes and observed that these DEGs were significantly enriched in immuneactivated signatures, such as cytokine-cytokine receptor interaction, Th17 cell differentiation, Th1 and Th2 cell differentiation, and antigen binding ( Supplementary Fig. S2). We also compared the difference in enrichment signatures between immune and non-immune classes through GSEA, which revealed significantly upregulated immune response related-pathways in the immune class, such as the intestinal immune network for IgA production, cytokine-cytokine receptor interaction, inflammatory response, IFN gamma response, IFN alpha response, and tumor necrosis factor alpha signaling via NF-kB ( Supplementary Fig. S2). These results indicated that the immune class identified depicted evident characteristics of immune activation.
To evaluate the accuracy of the NMF algorithm-generated immune molecular classification, we integrated features of the established immune class with previously reported CRC molecular features. Thorsson et al. 37 established a method of pan-cancer immune classification (C1-C6) of solid tumors by conducting an extensive immunogenomic analysis of over 10,000 tumors derived from 33 distinct cancer types. We observed a significant enrichment of C2 (IFN-γ-dominant, 48/190 vs. 24/236, p < 0.01), and a significant decrease in C1 (wound healing, 130/190 vs. 201/236, p < 0.01), in the immune class compared with those in the non-immune class (Fig. 2d). Consensus molecular subtypes (CMSs; CMS1-CMS4) proposed by the CRC Subtyping Consortium are regarded as the most accurate classification for the stratification of CRC cases 38,39 . We observed a significant enrichment of CMS4 (mesenchymal subtype, 60/190 vs. 43/236, p < 0.01) and a significant decrease in CMS2 (canonical subtype, 62/190 vs. 120/236, p < 0.01) in the immune class compared with those in the non-immune www.nature.com/scientificreports/ class ( Fig. 2d). Collectively, these results suggest that the immune-related classes identified in this study through the NMF analysis may be used to stratify CRC based on the immune molecular characteristics.
Immune class is composed of immune-activated and immune-suppressed subtypes based on different tumor niches. The TME varies in components of immunocytes and cytokines in the tumor niche, which confer anti-and pro-tumor activities during cancer progression. Moffitt et al. proposed stromal activation as a signature for the categorization of pancreatic ductal adenocarcinoma into normal and activated stromal subtypes, which was successfully applied to provide decision support and tailor therapies in the clinic 40 .
To explore this component in the immune CRC class, we integrated NTP analysis by determining stromal activation signature. We identified 46.9% (98/209) of the patients in the immune class that lacked stromal enrichment, whereas the remaining 53.1% (111/209) had a relatively higher stromal enrichment scores (Fig. 3a). TGF-β is regarded as the pivotal regulator of immune suppression within the TME and has been reported to play roles in tumor immune escape and poor response to ICI immunotherapy 41,42 . Notably, we observed that multiple TGF-β signatures, such as WNT/TGF-β and TGF-β response signatures of fibroblasts (Fibroblast-TBRs), TGF-β response signatures of T cells (T cells-TBRs), late TGF-β, and extracellular matrix cytokines (C-ECM) were all significantly enriched in the stromal-activated class than those observed in the remainder of the immune class (all p < 0.05, Fig. 3a). We also observed that CMS4 subtypes and PD-1 signaling were higher enriched in the stromal-activated class than the remainder of the immune class (Fig. 3a). Therefore, we defined the stromalactivated class as the "immune-suppressed subclass" of the immune class. The remaining cases, which lacked the TGF-β and PD-1 signatures, were defined as the "immune-activated subclass. " Cytotoxic T cells, Th1 cells, and IFN-γ potently eliminate tumor cells, whereas Treg cells, cancer-associated fibroblasts (CAFs), and myeloid-derived suppressor cells (MDSCs) are regarded as immune-suppressive cells 43,44 . Interestingly, we observed that immunosuppressive Treg cells, tumor-infiltrating Treg (TITR) cells, MDSCs, and macrophages signatures were significantly upregulated in the immune-suppressed subtype (all p < 0.05, Fig. 3a), which confirmed the suppressive status of this subclass. Additionally, we observed that the expression of immune-suppressive genes, such as TGFβ1, TGFβ3, LGALS1, and CCL2, was significantly higher in the  www.nature.com/scientificreports/ immune-suppressed subclass than in the immune-activated subclass (Fig. 3b). Collectively, these results revealed that the immune class consisted of two distinct subtypes based on suppressive signaling.

Reoccurrence of the three immune-related subtypes in other independent cohorts.
To validate the recurrence of our established immune-related subtypes, we performed NMF analysis and activated stromal signature-based separation in three CRC cohorts (GSE17538, GSE14333, GSE39582, n = 1015; Table 1). The top 150 dysregulated genes identified between the immune and non-immune classes in the training cohort were considered as the immune classifiers. In the GSE17538 cohort, we identified 50.0% (116/232) of the patients who had a high immune enrichment score and belonged to the immune class, whereas the remaining 50.0% (116/232) were defined as the non-immune class. In the immune class, patients were further divided into immune-activated (40.5%, 47/116) and immune-suppressed (59.5%, 69/116) subclasses (Fig. 4). Similar to the results obtained for the training cohort, the immune class showed higher enrichment scores for the infiltrating immunocyte-related signatures, TLS, CYT, and IFN signatures, than the non-immune class. Accordingly, the immune-suppressed subclass displayed higher scores for stromal activation, TGF-β signatures, and immunesuppressive-associated signatures than the other two classifications. Similar results were also observed in the GSE14333 and GSE39582 cohorts. In GSE14333, 27 Fig. S4). Genes related to immune suppression were upregulated in the immune-suppressed subclass in all three validation cohorts ( Supplementary Fig. S5). Collectively, these results confirmed the stability and robustness of our immune-related CRC molecular classification system.
Immune-related subtypes associated with prognosis outcome and response to ICIs. To explore the clinical implications of this molecular classification, we first evaluated the different prognosis outcomes through survival analysis. The immune-activated subclass exhibited the best overall survival (OS) and disease-free survival (DFS) outcomes, the immune-suppressed subclass showed the poorest prognosis, and the non-immune class exhibited moderate prognosis outcomes in both training and validation cohorts (Fig. 5a). We performed the multivariate analysis to confirm the results of the survival evaluation. The results showed the immune-related classification is an independent factor of OS and DFS (Supplementary Table S2). We further explored the correlation between these groups and clinicopathological characteristics. No significant correlation    Table S3).
To assess the potential applicability of this immune-related molecular classification system in the prediction of response to ICI treatment, we used SubMap analysis to compare the gene expression pattern of our immunophenotypes with those obtained for patients with metastatic melanoma who were subjected to treatment with ICIs. Notably, patients with the immune-activated subclass exhibited a highly similar gene expression profile to melanoma patients who responded to anti-PD-1 immunotherapy in both training and validation cohorts (Fig. 5b). These results imply that patients in the immune-activated subclass may be potential candidates for receiving ICI immunotherapy. TIDE 45 , which is an effective computational method for predicting response to ICIs, was also used to evaluate the ability of our molecular classification to predict ICI response. We observed that the ratio of ICI response in the immune-activated subclass was higher than that in the other two groups, whereas the immune-suppressed subclass showed the worst ICI responses (Fig. 5c). Taken together, the immuneactivated subclass exhibits the best prognostic outcomes and patients grouped under this subclass may benefit more from anti-PD-1 treatment.

Relationship between immune-related subtypes and CNAs, TIL enrichment, PD-1/PD-L1 expression, and cancer stemness.
To illustrate the molecular mechanism underlying the different immune phenotypes, we analyzed the relationship between these immune-related subclasses and tumor molecular characteristics. CNAs are associated with distinct molecular characteristics, immunologic phenotypes, and clinical prognosis. Chromosomal instability is a common phenomenon in cancer and usually refers to arm and   46 . Consistent with the results, we observed that the immune class exhibited a significantly lower burden of copy number amplification than the non-immune class at both arm and focal levels, whereas there was no significant difference based on copy number deletion between the two classes ( Fig. 6a, b). Notably, we found that the number of TILs was significantly upregulated in the immune class compared with those in the non-immune class (Fig. 6c). Moreover, we found that the expression of PD-1 and PD-L1 was significantly upregulated in the immune class compared with that in the non-immune class (Fig. 6d, e). Therefore, these well-accepted biomarkers used for predicting ICI response support the potential response of immune class to immunotherapy, suggesting a promising role for the immune-related molecular classification in guiding clinical immunotherapy. Genetic characterization of tumor tissues has confirmed that high TMB and neoantigens, resulting from somatic non-synonymous coding mutations, may elevate tumor immunogenicity and increase susceptibility to immunotherapy 47 . We observed a different mutation profile among the three immune classifications via Mut-SigCV algorithm analysis (Fig. 6f). Notably, the mutation frequency of ACVR2A and GABRB in the immune-activated subclass was significantly higher than the immune-suppressed subclass and non-immune class (ACVR2A: 10.59% vs. 7.22% and 3.14%; GABRB: 8.24% vs. 3.61% and 1.57%; respectively). More mutations in LRRC7 and CDH10 were revealed in the immune-suppressed subclass than in other classes (LRRC7: 18.07% vs. 9.41% and 5.1%; CDH10: 14.46% vs. 8.24% and 4.71%; respectively). High TMB and neoantigen burden correlated with response to immunotherapy in various solid tumors, whereas this association was uncertain in MSS tumors 47 . Our study revealed that the TMB and neoantigen burdens were comparable between the immune and non-immune classes (Fig. 6g, h). Miranda et al. 48 showed a negative correlation between stemness feature, represented by the mRNAsi signature, and the immune response. Accordingly, the mRNAsi signature expression was significantly lower in the immune class than in the non-immune class, implying that the immune class had  www.nature.com/scientificreports/ reduced stemness and could be more responsive to immunotherapy (Fig. 6i). Collectively, these results indicate that the immune class possesses significantly higher TIL enrichment and PD-1/PD-L1 expression and lower copy number amplification and stemness, which may contribute to its immune response.

Combination of immune-related subtypes and microsatellite status precisely guide immunotherapy strategy.
Microsatellite status is an important factor involved in response to immunotherapy in CRC; patients with MSI CRC show a favorable outcome with immunotherapy. While only 40% patients with MSI CRC respond to ICIs, 60% patients with MSI CRC still do not respond to ICIs. Patients with MSI CRC who respond to ICIs and how to improve the response rate of patients with MSI CRC to ICIs are still unknown. We tries to integrate immune-related subclasses in patients with MSI CRC to explore heterogeneity in the effect of immunotherapy in patients with MSI CRC. We identified 72 patients with MSI from the GSE39582 datasets. Based on our established immune-related classification, these 72 patients were divided into non-immune, immune-suppressed, and immune-activated subclasses. Twenty-two (30.6%) patients were allocated to the immune-activated subclass, which exhibited higher immune cell infiltration and immune activation signaling, implying these patients were potential responders to ICI therapy (Fig. 7). Thirty-four (47.2%) patients were allocated to immune-suppressed subclass, which exhibited higher immune suppressive cells and immune suppressive signaling, such as TGF-β, PD-1, PD-L1, MDSC, macrophages, and Tregs ( Fig. 7 and Supplementary  Fig. S6). We recapitulated these results in the TCGA cohort; 51 patients with MSI CRC were identified, and 25 of them were allocated to the non-immune subclass, 20 patients to the immune-suppressed subclass, and 5 patients to the immune-activated subclass. Similarly, immune suppressive molecules and signaling were higher in the immune-suppressed subclass than in the non-immune and immune-activated subclasses, although no statistical significance was observed owing to small number of patients in the immune-activated subclass ( Supplementary  Fig. S7). Collectively, these results suggest that patients with MSI CRC allocated to the immune-activated subclass have a "hot" immune status. The tumor may be regressed by single ICI immunotherapy. In contrast, for patients in the immune-suppressed subclass, ICI therapy combined with a TGF-β inhibitor or an agent for the www.nature.com/scientificreports/ elimination of immune suppressive cells might improve efficacy. Our novel classification not only provides new insights and assists in identifying appropriate candidate patients with MSI CRC who will respond to ICIs, but also, through combination immune-related subtypes and MSI status, may help tailor optimal immunotherapeutic treatment and further improve clinical outcome of patients with CRC on immunotherapy.

Discussion
Accumulating evidence has demonstrated the essential role of the TME in cancer progression 49 . However, the immune landscape in CRC remains poorly understood. A thorough investigation of different immune and stromal components in the TME may help understand complex tumor heterogeneity, develop more precise therapies, and refine current immunotherapeutic strategies. Particularly, establishment of an immune-related classification may provide data on more accurate subtypes of patients with CRC to help clinicians select more effective therapies. In this study, we presented a comprehensive immune-related classification of CRC based on the NMF algorithm according to CRC expression profiles. Using the top 150 exemplar genes of the immune pattern for consensus clustering, three immune-related molecular groups were identified, namely immuneactivated, immune-suppressed, and non-immune. Molecular classification associated with different immune signaling pathways and immune cell types was established and validated in three independent cohorts. Notably, the immune-related classes correlated with patient prognosis and response to ICI therapy, and this finding might be used to tailor clinical immunotherapy strategies for patients with CRC. The NMF algorithm, a multiplicative update algorithm, can be used to dissect substantial volumes of data using a limited number of components 50 . Emerging evidence has suggested NMF to be a promising method for classifying tumor molecular subtypes 51 . Using the NMF algorithm, we identified an immune pattern in the training cohort. Through consensus clustering of the exemplar genes of that immune pattern, CRC could be distinguished as immune and non-immune classes. Of the 488 patients in the training cohort, 42.8% were grouped under the immune class, which showed higher enrichment of infiltrating immune cells, cytolytic activity, and IFN signaling than the non-immune class. Effective immunotherapy is often limited by the immunosuppressive TME, which includes immunosuppressive cells and cytokines. Calon et al. 52 revealed a pro-metastatic program induced upon secretion of IL11 by TGF-β-stimulated CAFs in the tumor niche. Accumulating evidence has suggested that the C-ECM level is upregulated by activated CAFs to trigger the recruitment of immunosuppressive cells 53 . To investigate the different components in the CRC TME, we further dissected the gene expression profiles of the immune class and divided them into immune-activated and immune-suppressed subclasses based on a stromal-activated signature calculated using the NTP method. Interestingly, 46.9% of the patients in the immune class were assigned to the immune-activated subclasses, which showed lower enrichment of the stromal-related signatures and immunosuppressive cells-such as TGF-β, C-ECM, PD-1, Treg, and MDSC signatures-than that observed in the immune-suppressed subclass. This immune molecular classification was also validated in three independent cohorts. For example, the immune-activated subclass accounted for 40.5% of the patients, whereas the immune-suppressed subclass accounted for 59.5% of the patients in the immune class of the GSE17538 cohort. These results indicate that NMF may be used as a helpful classifier for stratifying immune molecular subtypes of patients with CRC.
Currently, treatment paradigms have been shifting from targeting the tumor cell compartment to focusing on the development of strategies for regulating the TME 54 . As a smaller number of patients respond to ICI therapy, identification of effective biomarkers for predicting patient suitability to immunotherapy is warranted. Recently, emerged immunological and non-invasive methods, such as liquid biopsy using peripheral T cell receptor repertoire of PD-1 + CD8 + lymphocytes, provide a non-invasive approach for selecting patients with metastatic CRC who could benefit from ICI treatment 55 . Meanwhile, with the advances in computational methods, such as Sub-Map and TIDE, a novel algorithm was established to predict ICI therapy response. SubMap analysis is conducted using an algorithm to evaluate the similarity in gene expression profiles between patients in different molecular subtypes and in patients with metastatic melanoma treated with ICIs 56 . Notably, here, the results of the SubMap analysis showed that patients in the immune-activated subtypes exhibited similar gene expression profiles to those of patients with melanoma who responded to anti-PD-1 immunotherapy, which suggested that patients in the immune-activated subclass could respond to PD-1. Moreover, survival analysis shows that patients in the immune class, especially in the immune-activated subclass, presented with better OS and DFS. These results may have clinical implications in terms of formulating prognosis and treatment decisions.
Genomic studies have revealed that CNAs, TMB, and neoantigens demonstrate a predictive value for ascertaining susceptibility to immunotherapies 57 . Lower CNAs were linked to the elevated infiltration of immunocytes and cytokines, which implied a higher likelihood of successful ICI therapy 58 . We analyzed the relationship between the established molecular subtypes and genetic features. We observed that CNA amplification was significantly lower in the immune class than that in the non-immune class. The levels of other canonical biomarkers that could help predict response to ICI, such as TIL and PD-1/PD-L1 expression, were significantly higher in the immune class, which confirmed patients in the immune class could potentially benefit from immunotherapy.

Conclusions
Herein, we proposed a novel immune molecular classifier based on gene expression profiles analyzed using the NMF algorithm. Patients in the immune-activated subclass present with favorable prognosis and may respond positively to anti-PD-1 immunotherapy. This study provides new insights into tumor heterogeneity using an integrated and multifaceted approach to enhance the discovery of clinically important molecular subtypes that respond to immunotherapy.

Data availability
All data generated or analyzed during this study are included in this published article and its Supplementary Information files.