Ratio of the interferon-γ signature to the immunosuppression signature predicts anti-PD-1 therapy response in melanoma

Immune checkpoint inhibitor (ICI) treatments produce clinical benefit in many patients. However, better pretreatment predictive biomarkers for ICI are still needed to help match individual patients to the treatment most likely to be of benefit. Existing gene expression profiling (GEP)-based biomarkers for ICI are primarily focused on measuring a T cell-inflamed tumor microenvironment that contributes positively to the response to ICI. Here, we identified an immunosuppression signature (IMS) through analyzing RNA sequencing data from a combined discovery cohort (n = 120) consisting of three publicly available melanoma datasets. Using the ratio of an established IFN-γ signature and IMS led to consistently better prediction of the ICI therapy outcome compared to a collection of nine published GEP signatures from the literature on a newly generated internal validation cohort (n = 55) and three published datasets of metastatic melanoma treated with anti-PD-1 (n = 54) and anti-CTLA-4 (n = 42), as well as in patients with gastric cancer treated with anti-PD-1 (n = 45), demonstrating the potential utility of IMS as a predictive biomarker that complements existing GEP signatures for immunotherapy.


INTRODUCTION
Historically, advanced melanoma has a poor prognosis, with a 5year survival rate of less than 10%. 1 Immune checkpoint inhibitors (ICIs) targeting PD-1 and CTLA-4 have shown improved survival in advanced melanoma patients [1][2][3][4] , but only a subset of patients respond. In addition, the efficacy of ICIs has been observed to be significantly lower for East Asian melanoma patients than for Caucasian patients 5,6 .
Published data suggest that tumor mutational burden (TMB) and PD-L1 expression may predict the clinical benefit of anti-PD-1 therapy in multiple cancer types [7][8][9] . Although the potential predictive power of PD-L1 expression for the clinical benefit of anti-PD-1 therapy remains controversial for advanced melanoma patients 10,11 , higher TMB has been correlated with a superior clinical response 12,13 , improved survival 14,15 , and durable benefit 12,16 in advanced melanomas. In Asian melanoma patients, acral 17,18 and mucosal melanomas 18 are the predominant subtypes and generally have a low point mutation burden. Consequently, it is not clear whether TMB is an effective predictor for advanced Asian melanoma patients.
In addition to TMB and PD-L1 expression, prediction models based on GEPs have also been proposed. Most GEP signatures consider T cell inflamed microenvironments characterized by the upregulation of IFN-γ signaling, antigen presentation, and immune checkpoint-related genes when predicting response to ICIs across multiple cancer types. However, these features are necessary but not always sufficient for a cancer patient to receive clinical benefit from ICI treatments 19 . A recent meta-review showed that predictive models built on inflamed GEP signatures achieved a moderate area under the receiver operator curve (AUC) value of 0.65 for the summary receiver operation characteristic (sROC) curve generated from 10 different solid tumor types in 8135 patients 20 .
Here, we argued that immune suppressive elements in the tumor microenvironment (TME) should be considered in combination with an inflamed GEP signature to more accurately predict ICI therapy outcomes. The main objective of this study was to develop immunosuppressive GEP signatures that, when used in combination with inflamed GEP signatures, could better stratify patients based on their potential benefits from ICI therapy. We started by analyzing RNA-seq data from baseline biopsy samples of melanoma patients treated with anti-PD-1 therapy and identified a set of 18 genes that played an "antagonistic" role against a pro-inflammatory TME and lead to negative outcomes in the discovery cohort consisting of multiple datasets. Our results reveal that key genes of the identified IMS are related to hallmark activities of cancer-associated fibroblasts (CAFs), macrophages and epithelial to mesenchymal transition (EMT), and the balance between the IFN-γ signature and the IMS plays an important predictive role in both immunotherapy-naive primary tumors from The Cancer Genome Atlas (TCGA) database and ICI-treated patients.

Definition of an immunosuppression signature
We reviewed the literature and found three external datasets 14,15,21 of advanced melanoma treated with an anti-PD-1 ICI with response and RNA-seq data, which we used as our discovery cohort (n = 120; see "Methods" section). We then identified 18 genes of which the expression levels, after adjusting for IFN-γ signature score, are consistently associated with negative response to ICI in the discovery cohort as our IMS (FAP, PDGFRB,  CD163, SIGLEC1, IL10, CCL2, CCL8, CCL13, INHBA, VCAN, AXL, TWIST2,  ADAM12, COL6A3, STC1, ISG15 Table 1. Figure 1D shows a heatmap of the expression of all genes in the IMS and IFN-γ signatures in the combined discovery cohort. Patients with elevated expression of IFN-γ-related genes included both responders and nonresponders, suggesting that an inflamed TME alone, as indicated by a higher IFN-γ signature score, is not sufficient to ensure positive outcomes from ICI. On the other hand, elevated expression levels of IMS genes were observed in patients from the nonresponder groups in the IFN-γ+ subgroup (n = 49, p = 0.0006; Fig. 1E). These data indicate a potential immunosuppressive role of the IMS genes that is opposite to the role of the IFN-γ related inflammatory signature, and both signatures should be considered in order to have more accurate predictions of outcomes from immunotherapy.
Association of the immunosuppression signature with immune cell types The identified IMS shows a strong presence of genes related to CAFs (FAP and PDGFRB 22 ) and tumor-associated macrophages (TAMs) (CD163 23 and SIGLEC1 24,25 ), as well as their associated cytokines or chemokines that lead to an immunosuppressive microenvironment (IL10 26 , CCL2, CCL8, and CCL13 14 ) and stromal activities that lead to tumor proliferation, invasion and immune escape such as EMT or extracellular matrix (ECM) degradation (AXL 27 , TWIST2, ADAM12 28 , and COL6A3 29 ). Therefore, high infiltration of CAFs and myeloid cells and their related stromal activities may be the reasons behind the lack of response from patients with an inflammatory TME. To further validate this hypothesis, we performed digital cell composition analysis using xCell 30 on a combined melanoma dataset consisting of the three datasets in the discovery cohort and a TCGA melanoma dataset (n = 516) and in TCGA melanoma dataset only (n = 309), and analyzed the distribution of different immune cell types with respect to the IFNγ signature and IMS scores.
As expected, we observed that the IMS score was positively correlated with the abundances of fibroblasts (r = 0.62, p < 0.0001), monocytes (r = 0.45, p < 0.0001) and macrophages (r = 0.34, p < 0.0001) ( Fig. 2A). Stratification of patients into IFN-γ+/− and IMS+/− subgroups according to their median values further revealed the different distributions of immune cells in relation to these two signatures (Fig. 2B). Fibroblasts were significantly enriched in IMS + subgroups regardless of IFN-γ status (p < 0.0001; Fig. 2C). In addition, higher abundances of macrophage were associated with both higher IMS scores and higher IFN-γ signature scores. Interestingly, M2 macrophages, which play an important immunosuppressive role in the TME, were significantly associated with the IMS score in only the IFN-γ+ subgroups  IFN-γ   IMS   IMS score   IFNG  STAT1  CCR5  CXCL9  CXCL10  CXCL11  IDO1  PRF1  GZMA  HLA-DRA  CCL8  VCAN  CCL2  BCAT1  ISG15  CD163  AXL  CCL13  COL6A3  SIGLEC1  PDGFRB  IL10  STC1  ADAM12  OLFML2B  FAP  TWIST2 Fig. 1 The definition of the IMS genes. A-C Volcano plot depiction of differentially expressed genes after normalization by the IFN-γ score of individual sample by response on Riaz17 (in A, n = 51), Gide19 (in B, n = 41) and Hugo16 (in C, n = 28). R, responders (CR or PR); NR, nonresponders (PD) as per RECIST 1.1. IMS genes are highlighted in red. D Heatmap showing the expression of genes from the IFN-γ signature and the IMS stratified by IFN-γ and response to anti-PD-1 therapy. Rows represent genes and columns represent patients. The expression levels were z-normalized within rows for visualization. The cutoff value for the IFN-γ signature score was set to its median. The IMS scores in responders versus nonresponders for IFN-γ+ (E) and IFN-γ− (F) subgroups.
(p = 0.0281) but not the IFN-γ− subgroups. On the other hand, higher IFN-γ signature scores were associated with increased infiltration of CD8+ T cells, CD4+ T cells and B cells. However, the association of IMS scores and abundances of these cells within the microenvironment is not significant. Similar association can be observed on cell type abundance results from TCGA melanoma dataset only ( Supplementary Fig. 1). All these results are consistent with the notion that IMS genes are related to immunosuppressive activities in cancers, and the balance between IFN-γ signature and IMS scores has a significant role in determining which patients benefit from adaptive immune rejuvenating therapies.
Balance between the IFN-γ signature and the IMS as a biomarker for cancer We next studied the distribution of IMS scores and their interaction with IFN-γ signature scores in different tumor types using TCGA data. First, we analyzed the correlation between IMS scores and IFN-γ signature scores for all TCGA patients (n = 11,043; Fig. 3A). The results showed that the IFN-γ signature and IMS scores had a modest positive correlation with r = 0.53 (p < 0.0001); however, IMS scores were poorly explained by IFN-γ signature scores (R 2 = 0.28), suggesting that these two signatures are not fully overlapping and might contribute complementary information regarding the TME. A similar conclusion can be made on the correlation of the IMS and IFNγ scores on selected cancer types from TCGA ( Supplementary Fig. 2). Given that the status of the tumor immune microenvironment and the associated composition of immune cells contain prognostic information, we hypothesized that the balance between IFN-γ signature and IMS may be associated with the survival of cancers. To assess this possibility, we performed a stratified multivariate analysis using Cox proportional hazards regression within each TCGA cancer type. The results showed that the association between the ratio of IFN-γ signature to the IMS score (IFN-γ/IMS) and overall survival (OS) varied according to cancer type (Fig. 3D). A higher IFN-γ/IMS ratio was associated with a modest prognostic benefit after adjusting for sex, age, and TMB in breast invasive carcinoma (BRCA) (HR = 0.92; 95% CI: 0.82-0.98), cutaneous melanoma (SKCM) (HR = 0.89, 95% CI: 0.87-0.99), stomach adenocarcinoma (STAD) (HR = 0.86; 95% CI: 0.77-0.96), bladder urothelial carcinoma (BLCA) (HR = 0.93; 95% CI: 0.87-0.99) and cervical tumors (CESC) (HR = 0.85; 95% CI: 0.74-0.98). Conversely, a higher IFN-γ/IMS ratio was associated with poor prognosis in uveal melanoma (UVM) (HR = 1.67; 95% CI: 1.17-2.38), uterine carcinosarcoma (UCS) (HR = 1.49; 95% CI: 1.01-2.20) and brain lower-grade gliomas (LGG) (HR = 1.41; 95% CI: 1.06-1.88), suggesting that these cancers may have different antitumour immune responses than those cancers mentioned previously. Interestingly, it was previously reported that a higher TMB was associated with poor survival in patients with glioma 7 . Associations of the IFN-γ/IMS ratio and survival in different directions have also been observed in other cancer types but did not reach statistical significance. Furthermore, we tried to include tumor stage (III/IV versus I/II) in the Cox model for cancer types with this information available on TCGA and found that the association of the IFN-γ/IMS ratio and survival remains significant for BRCA, BLCA, SKCM and UVM after adjusted for tumor stage ( Supplementary Fig. 1).
We further checked the relationship between the IFN-γ/IMS ratio and TMB scores in TCGA data and found that there was a positive but weak association between them (r = 0.20, R 2 = 0.042, p < 0.0001; Fig. 3B) on all samples from TCGA datasets (n = 11,043) and on selected cancer types ( Supplementary Fig. 2).
Finally, following the same method as in 31 , we compared the median values of the IFN-γ signature score, the IMS score, and their ratio with objective response rates (ORRs) to anti-PD-1 therapies for cancer types with efficacy performance data available 32 to get an educated guess of the applicability of these scores as potential biomarkers for immunotherapy. A positive correlation of ORR with the IFN-γ signature score (R 2 = 0.27, p = 0.047, Fig. 3E) and IFN-γ/IMS (R 2 = 0.54, p = 0.001, Fig. 3G) was observed. Importantly, tumors with high median IFN-γ/IMS values, most notably SKCM 33 , colon adenocarcinoma (COAD), CESC 34 , BLCA 35 , lung squamous cell carcinoma (LUSC) 36 and liver hepatocellular carcinoma (LIHC), have shown clinical sensitivity to ICI therapies (Fig. 3G). Some tumor types (e.g., pancreatic adenocarcinoma (PAAD) and BRCA) have shown poor response to immunotherapy despite their moderate to high median IFN-γ scores (Fig. 3E). These tumors are known to be highly infiltrated with myeloid cells, which may serve as an additional immunosuppressive mechanism preventing efficacy with ICI therapy 37,38 .
Notably, these cancer types were also characterized by elevated IMS scores (Fig. 3C).
Ratio of the IFN-γ signature score to the IMS score predicts PD-1 blockade efficacy We next assessed whether directly using the ratio of the IFN-γ signature score to the IMS score could be used as a reliable metric to predict anti-PD-1 therapy outcome for melanoma patients. Patients with higher IFN-γ/IMS scores have better chance to receive clinical benefits from anti-PD-1 therapy ( Fig. 4A-C). We used IFN-γ/IMS together with the clinical response data to generate receiver operating characteristic (ROC) curves to quantify its prediction performance in our discovery cohorts. The resulting AUCs were in the range of 0.70-0.83 (Fig. 4B), which is better than other GEP signatures from the literature ( Fig. 4D-F). In addition, higher IFN-γ/IMS scores also associated with improved OS in the Riaz17 and Gide19 datasets ( Fig. 4G-H). We next tested the prediction ability of IFN-γ/IMS in a newly generated RNA-seq dataset from 55 tumor tissues of melanoma patients treated with anti-PD-1 monotherapy at Peking University Cancer Hospital (PUCH), Beijing, China. In this dataset, IFN-γ/IMS achieved a prediction accuracy of AUC = 0.81 (95% CI: 0.69-0.93; Fig. 5E). Using the threshold that generated the maximum Youden index 39 to divide patients into predicted responder (n = 29) and predicted nonresponder groups (n = 26), IFN-γ/IMS successfully captured 67.71% of nonresponders (23 out of 35) with only 3 exceptions (1 patient with a complete or partial response (CR/PR) and 2 patients with stable disease (SD) were misclassified as nonresponders), achieving a classification accuracy of 88.5% (p = 0.0006) for this group. On the other hand, of the 29 patients classified as predicted responders by IFN-γ/IMS, 17 (13 patients  Fig. 6G). Collectively, the above data demonstrate the potential value of IFN-γ/IMS ratio as a combinatorial biomarker for anti-PD-1 treatment for metastatic melanoma.
Although the IMS was derived from advanced melanoma cohorts receiving anti-PD-1 treatment, the resulting genes measure immune-related expression levels with minimal contribution from tumor-related transcriptomic activities. Therefore, the signature may provide a treatment or tumor-type agnostic insight into immune microenvironment activities. To test this concept, we further evaluated the prediction performance of IFNγ/IMS on two publicly available RNA-seq datasets with pretreatment samples from melanoma patients treated with anti-CTLA-4 therapy (VanAllen15; n = 42) 40 and metastatic gastric cancer patients treated with anti-PD-1 therapy (Kim18; n = 45) 41 . The resulting AUCs were 0.75 (95% CI: 0.59-0.91; Fig. 6B) and 0.82 (95% CI: 0.64-0.99; Fig. 6C), respectively, for these two datasets. In addition, patients with high IFN-γ/IMS ratios (using the Youden index to determine the cutoff point) had better ORR on the VanAllen15 dataset (p = 0.0004; Supplementary Fig. 4E) and the Kim18 dataset (p = 0.0022; Supplementary Fig. 4F) and longer OS in the VanAllen15 dataset (HR = 3.06; 95% CI: 1.41-6.61; p = 0.0032; Fig. 6H) than patients with low IFN-γ/IMS ratios, suggesting the potential of using the IFN-γ/IMS ratio as a predictive biomarker for immunotherapies different to anti-PD-1, or other cancer types.

Comparison with other GEP signatures
Currently, there were significant number of independent studies on GEP signatures that predict the response of patients to anti-PD-1 therapy. To compare the prediction accuracy of the proposed IFN-γ/IMS ratio with that of existing GEP signature-based predictors, we generated predictions using nine published GEP signatures (Table 1) and the comparison results showed that IFN-γ/IMS ratio achieved same or better AUC performance on both PUCH (Fig. 5F) and the three external validation cohorts (Fig. 6D-F). One limitation of existing GEP signature studies is that many of these signatures were validated with independent cohorts within each publication, but frequently these signatures have not performed well in follow-up reports. To further validate the robustness of the proposed approach, we performed a randomized permutation test where three datasets were randomly selected from the seven datasets (Table 2) as the discovery cohort to identify the top 18 IMS genes as described previously. We then tested the prediction performance of ratio of IFN-γ and the identified IMS on the remaining four datasets. The results from the total 35 permutation tests indicated that IFN-γ/IMS outperformed other GEP signatures by a significant margin (Wilcoxon matched-pairs signed rank test, p < 0.0001; Fig. 7A). Significantly, we found that the IMS signatures from the randomized tests were highly consistent despite that they were obtained from different training datasets. More than half of the total 630 occurrences of

DISCUSSION
There is significant interest in developing robust biomarkers of response to immunotherapy, as well as identifying actionable targets in those who do not respond to the current standard ICI therapies. Gene expression biomarkers, such as Oncotype DX 43 , have demonstrated clinical utility in predicting treatment benefits in breast cancer. However, as interactions between the tumor and its microenvironment are highly complex, constructing predictors of patient response to ICIs remains a serious challenge. Existing efforts to create gene expression-based tests for ICI efficacy have mainly focused on developing "response signatures" that measure the expression of adaptive immune response-related inflammatory genes 20 , most of which include an IFN-γ gene signature as a major component 44 . However, due to the presence of intricate immunosuppressive mechanisms within the TME, the presence of a peripherally suppressed adaptive immune response alone appears to be necessary but not sufficient for clinical benefit from PD-1/PD-L1 blockade. In this study, we identified an immunosuppression signature that, when combined with an inflammatory signature, had predictive value in patients with advanced melanoma treated with PD-1 blockade. To maximize our chance in identifying the correct genes for this signature, we started with an established 10-gene IFN-γ signature measuring the expression of genes associated with cytotoxic cells, antigen presentation, and IFN-γ activity 44 , and then selected genes that were significantly upregulated in nonresponsive versus responsive groups after they were normalized by IFN-γ signature scores of individual patients using one-sided Student's t-test. To avoid potential batch effect from different datasets, the Student's t-tests were conducted individually on each dataset from the discovery cohort. In addition, since conceptually, a truly predictive gene should produce a significant result in all datasets, we used the Pearson's method 45 to combine p-values from different datasets to identify the IMS genes. As the Pearson's method is more sensitive to the largest p-value when combining p-values from multiple tests, this approach helps to avoid artefacts due to single significance from individual dataset 46 .
Interestingly, the genes identified in our IMS through the above computational method were highly consistent with several important biological activities related to innate or acquired resistance to ICIs. CAFs are a nonredundant, immunosuppressive component of the TME 47,48 . It was previously reported that inhibin β A (INHBA) production by cancer cells helps to induce CAFs, and ablating INHBA decreases the CAF phenotype both in vitro and in vivo 49 . CAFs hinder antitumour immunity by secreting immunosuppressive cytokines such as IL-10 and TGF-β, reducing the function and viability of cytotoxic T lymphocytes 50 (Table 1) in predicting response to ICI.
CCL2 48,52 . Notably, Siglec-1/CD163 is associated with the activation of macrophages towards an immunosuppressive phenotype, and accordingly, the expression of both CAF (FAP) and TAM (SIGLEC1) markers is associated with poor clinical outcomes across multiple tumor types [53][54][55][56] . CAF-mediated EMT, which is strongly correlated with the expression of AXL, TWIST2 and ADAM12 from the IMS, can result in biomechanical and biochemical changes that facilitate tumor immune escape, invasion, and metastasis 57 . The dense collagen matrix produced by CAFs may also present a physical barrier to the infiltration of T lymphocytes 58 or treatments reaching the cancer cells 59 . Indeed, the association between a lack of response to ICIs and upregulated EMT-related genes has been observed in multiple cancers 41,60 , and inhibiting CAF/TAM-related pathways and extracellular collagen and hyaluronan can induce T cell accumulation and improve the outcome of ICIs 61-64 , reinforcing the role of those stromal-related activities in limiting the efficacy of immune checkpoint blockade immunotherapy. Using the ratio of opposing immune signatures instead of the absolute value of individual signatures as a predictive biomarker brings another advantage. It is well known that to compensate for potential technical variation, raw gene expression data from RNA-seq must be normalized so that meaningful biological comparisons can be made 65 . Typically, this is done with a set of housekeeping genes that are expected to maintain constant expression levels under different experimental conditions 66 . However, it has become increasingly clear that housekeeping gene expression levels may vary considerably in some conditions 67,68 . When that happens, the normalization process itself can lead to increased intersample "noise" that covers meaningful differences in target genes if the chosen housekeeping genes fluctuate randomly or erroneous results if there is a directional change in the housekeeping genes between experimental groups 67,69 . The calculation of the IFN-γ/IMS ratio provides a self-normalization method that directly measures the balance between contradicting biological processes within the tumor microenvironment, thus providing a selection process that is more robust to the confounding factor of intersample gene expression variations to identify genes that contribute negatively to the outcome of immunotherapy.
Our study has limitations. Since the current ICI clinical trials have generated gene expression data for only a limited number of pretreatment samples, which were insufficient to train robust prediction models, we did not systematically optimize the weights of individual genes in the IFN-γ/IMS ratio calculation. With more RNA-seq data available from subsequent studies, we expect that further optimization of the combined biomarker will yield even better predictive accuracy. In this study, we did not attempt to specify a universally applicable cutoff point for IFN-γ/IMS for different datasets due to potential batch effects from different RNA-seq procedures conducted at multiple sites. Rather, we demonstrated a trend that shows an increase in benefits with increasing IFN-γ/IMS ratios. Nevertheless, we envision that a relevant cutoff would need to be aligned to specific assay designs and clinical situations as the cutoff value may be affected by the sampling bias from intra-tumor heterogeneity 70 , as well as the difference in the cellular composition of primary and metastatic tumors. The IFN-γ/IMS ratio was primarily designed as a predictive biomarker for anti-PD-1 therapy for melanoma. Therefore, the OS results reported in this paper should not be overinterpreted since OS is determined not only by response but also by events not considered in the current model such as acquired therapy resistance, adverse event, or presence of other targetable mutations, which need to be further evaluated towards a more meaningful prognostic biomarker. Finally, our analysis is retrospective in nature, and validation of the findings in additional datasets is warranted.
In conclusion, the IMS studied in this paper exemplifies the potential of using GEP signatures for modeling the adverse TME, and using IMS in combination with an existing inflammatory GEP signature enables better identification of patients who could respond favorably to ICIs. Currently, clinical trials are assessing the efficacy of combining anti-PD-1 therapy with medicines that target at normalization of immune suppressive TME including the CSF1R inhibitor Cabrilizumab for the treatment of resectable biliary tract cancer (NCT03768531), CCR2 inhibitor plozalizumab for the treatment of melanoma (NCT02723006), the FAP inhibitor RO6874281 for the treatment of metastatic head and neck, esophageal or cervical cancers (NCT03386721) and metastatic melanoma (NCT03875079), and the TGF-β inhibitor galunisertib (LY2157299) for the treatment of advanced-stage NSCLC or hepatocellular carcinoma (NCT02423343) and metastatic pancreatic cancer (NCT02734160). However, due to the diversity of the immune evasion mechanisms in inflammatory tumors, specific immunosuppressive mechanisms utilized by each individual tumor would still need to be fully understood and gauged to better direct patients to different combination therapy options. PUCH samples are mostly acral or mucosal, and are not biologically similar to samples from the TCGA or other melanoma datasets used in this study, which are mostly cutaneous. Acral or mucosal melanoma in primary site often has deeper Breslow thickness, higher incidence of ulceration, and is more prone to visceral metastasis. Compared with cutaneous melanoma. acral or mucosal melanoma has lower TMB 17 and shows decreased response to PD-1 monotherapy 5,11 . Although recent research found that the efficacy could be greatly improved by PD-1 based combination therapy in mucosal melanoma cohort 71 , the molecular mechanisms and predictive biomarkers are yet to be explored. Nevertheless, results in this study indicate that the IMS signature plays a very similar role in profiling the immune suppressive microenvironment of acral or mucosal melanoma although it was derived from cutaneous melanoma. In this regard, it is anticipated that IMS or future immunosuppressive signatures gleaning through deeper understanding of the immunosuppressive mechanisms of cancer would enable the development of more effective stratification models or therapeutic combinations to increase the efficacy and cost-effectiveness of immunotherapies for the benefits of cancer patients.

Patients and specimens
In this study, we obtained 55 formalin-fixed, paraffin-embedded (FFPE) pretreatment tumor tissues from melanoma patients treated with anti-PD-1 monotherapy at PUCH, Beijing, China, between March 2016 and March 2019 (Table 3). Diagnosis was histopathologically confirmed for all patients. Clinical data, including sex, age, tumor site, tumor thickness, metastasis status, and clinical efficacy, were collected. Therapy outcomes evaluated following Response Evaluation Criteria in Solid Tumors (RECIST) version 1.1, including presence of a CR or PR, SD and progressive disease (PD), were used to assess efficacy. OS was calculated from the treatment start date. Patients who did not die were censored at the date of last contact. The study followed the REMARK reporting guidelines.

Whole-transcriptome RNA sequencing
Total RNA was extracted from unstained FFPE tumor samples by the All Prep-DNA/RNA-Micro Kit (Qiagen) following the standard manufacturer's protocol. Reverse transcription and second-strand cDNA synthesis were subsequently performed. Barcoded RNA libraries were generated and captured by a customized whole-exome panel. All libraries were sequenced on the Illumina NovaSeq 6000 platform with 2 × 150 bp paired-end reads. The mean sequencing coverage across all samples was ∼100× (3.5 G). RNA-seq reads were mapped to the human reference genome GRCh37 using STAR 72 , and gene expression was quantified using RSEM 73 . Coding region reads were counted to calculate fragments per kilobase of transcript per million mapped reads (FPKM) values at the gene level and log2-transformed before analysis to avoid extremely skewed gene expression distributions.

External data sources
We collected the RNA-seq data of melanoma patients from six immunotherapy studies with gene expression profiles for pretreatment tumors and complete clinical information, including the Riaz17 (n = 51) 15 , Hugo16 (n = 28) 14 , Gide19 (n = 41) 21 , VanAllen15 (n = 42) 40 , Liu19 (n = 54) 13 , and Kim18 (n = 45) 41 datasets (Table 2). Patients from these clinical studies were treated with nivolumab 15 and/or pembrolizumab 14,21 . For the Gide19 and Liu19 studies, only baseline data from samples that received anti-PD-1 monotherapy (nivolumab or pembrolizumab) were used. The immunotherapy outcomes provided in the original publications following RECIST guidelines (PR/CR/SD/PD) were used in our analysis. The gene expression data of VanAllen15 and Liu19 were downloaded from respective references as provided by the authors. For Riaz17, Hugo16, Gide19 and Kim18, the RNA-seq raw data was obtained and processed by the above-mentioned pipeline to generate the gene expression data. We downloaded TCGA Level-3 RSEM-normalized RNA-seq data and mutation packager calls from the TCGA database. The RNA-seq data were

Identification of the IMS genes
To identify the IMS genes, we first normalized the gene expression levels of a patient with respect to his/her IFN-γ signature score. We then performed a one-sided Student's t-test to capture genes that were significantly uplifted in the nonresponse group (PD) versus the response group (PR/CR) after they were normalized by the IFN-γ signature scores. Due to the large dimensionality of the data, we restricted our search to the 770 cancer immune-related genes curated in Nanostring's IO 360 panel. In addition, to avoid potential batch effects from different datasets, the one-sided Student's t-test was performed on each individual dataset independently and the resulting p-values from the three datasets in the discovery cohort were then combined using Pearson's method 45 to avoid artefacts due to single significance from individual dataset. Finally, the genes were ranked based on their Pearson combined p-values, and the top 18 genes were identified as our IMS genes.

Calculation of GEP signatures
We collected nine published GEP signatures 44,[74][75][76][77][78][79][80] (Table 1) related to the immune checkpoint response from the literature and validated in our cohorts. Sample-wise scores of these signatures were calculated from RNAseq data following the methodology described in the corresponding papers. Genes with unavailable expression data were excluded from the calculation of signature scores. For the IFN-γ signature in this paper, we used the arithmetic mean of the log2-transformed, house-keeping gene-normalized expression levels of the 10-gene "preliminary" IFN-γ signature (IFNG, STAT1, CCR5, CXCL9, CXCL10, CXCL11, IDO1, PRF1, GZMA, and HLA-DRA) 44 . Similarly, the IMS score was calculated as the arithmetic mean of the log2-transformed, housekeeping gene-normalized expression levels of the 18 IMS genes listed in Supplementary Table 1. Furthermore, the IFN-γ signature/IMS ratio was calculated as the difference between these two scores in the logarithmic domain.

Single cell RNA-seq
Briefly, scRNA-seq data of 31 melanoma tumors were downloaded from GEO database (GSE115978) 42 . The original expression profiles and cell type annotations were used. Principal component analysis (PCA) was performed to reduce the dimensionality of the scRNA-seq profiles. Then t-SNE projections were generated using the first 25 principal components. Both PCA and t-SNE analysis were performed by RunPCA and RunTSNE functions in the Seurat package (version 3.1.0) with default parameters.

Data analysis and statistical information
Associations between categorical measurements and patient groups, such as the predictive accuracy of different biomarkers/panels, were evaluated using Fisher's exact test. Differences in continuous measurements were tested using the two-tailed Mann-Whitney U-test. Correlations between two groups of continuous variables were evaluated using Pearson correlation analysis. Multivariable Cox proportional hazard regression models were used to identify prognostic factors, and the results were reported as hazard ratios (HRs) with 95% confidence intervals. The Kaplan-Meier method was utilized to estimate OS and PFS curves, and difference between groups were assessed using the log-rank test. Twosided p-values were used unless otherwise specified, and a p-value less than 0.05 was considered significant. For boxplots, center mark is median and whiskers are minimum/maximum unless specified otherwise.
To validate the robustness of the proposed approach in identifying the IMS GEP signature, we performed a randomized permutation test where three datasets were randomly selected from the seven datasets (Table 2) as the discovery cohort to identify the top 18 IMS genes as described previously. The prediction performance of the ratio of IFN-γ to the identified IMS was then evaluated on the remaining four datasets. The resulting AUCs were compared with those of other GEP signatures on the same testing datasets using Wilcoxon matched-pairs signed rank test.
PRISM was used for basic statistical analysis and plotting (http://www. graphpad.com), and the Python language and programming environment were used for the remainder of the statistical analysis. The abundances of multiple cell types in whole tissue samples were estimated using xCell 30 .

Ethics approval
All the procedures including the collection, processing and analysis of tumor samples in this study were approved by the Institutional Review Board of the PUCH. Written informed consent was obtained from all participants.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The raw sequence data from PUCH dataset reported in this paper have been deposited in the Genome Sequence Archive in National Genomics Data Center, Beijing Institute of Genomics (China National Center for Bioinformation), Chinese