Immune classification of clear cell renal cell carcinoma

Since the outcome of treatments, particularly immunotherapeutic interventions, depends on the tumor immune micro-environment (TIM), several experimental and computational tools such as flow cytometry, immunohistochemistry, and digital cytometry have been developed and utilized to classify TIM variations. In this project, we identify the immune pattern of clear cell renal cell carcinomas (ccRCC) by estimating the percentage of each immune cell type in 526 renal tumors using the recently developed technique of digital cytometry. The results, which agree with those of a large-scale mass cytometry analysis, show that the most frequent immune cell types in ccRCC tumors are CD8+ T-cells, macrophages, and CD4+ T-cells. Notably, unsupervised clustering of ccRCC primary tumors based on their relative number of immune cells indicates the existence of four distinct groups of ccRCC tumors. Tumors in the first group consist of approximately the same numbers of macrophages and CD8+ T-cells and a slightly smaller number of CD4+ T-cells than CD8+ T-cells, while tumors in the second group have a significantly higher number of macrophages than any other immune cell type (P-value < 0.01). Tumors in the third group have a significantly higher number of CD8+ T-cells than any other immune cell type (P-value < 0.01), while tumors in the fourth group have approximately the same numbers of macrophages and CD4+ T-cells and a significantly smaller number of CD8+ T-cells than CD4+ T-cells (P-value < 0.01). Moreover, there is a high positive correlation between the expression levels of IFNG and PDCD1 and the percentage of CD8+ T-cells, and tumors of higher stage and grade have a substantially higher percentage of CD8+ T-cells. Furthermore, the primary tumors of patients who are tumor free at the last follow-up have a significantly higher percentage of mast cells (P-value < 0.01) than those of patients with tumors, in all groups of tumors except group 3.


Clear cell renal cell carcinoma (ccRCC) is the most frequently diagnosed malignant tumor type in the adult kidney, accounting for approximately 85% of kidney cancer cases 1 , and surgical resection is the common therapy for ccRCC. However, surgery is not effective for patients with advanced or metastatic ccRCC 2 . Several immunotherapeutic approaches have recently been used for treating patients with ccRCC 3,4 , which is considered a morphologically and genetically immunogenic tumor 5 . However, many patients do not respond to these treatments and develop adaptive or intrinsic resistance. The response rate to these treatments could be increased by identifying the types of tumors that would respond to them.
Several studies show that cancer cells and tumor infiltrating immune cells (TIICs), which have important roles in both regulation of cancer progression and promotion of tumor development 6,7 , play an important role in the determination of malignant tumor types 8,9 . Tumor-infiltrating lymphocytes (TILs), which include T-cells and B cells, are an important category of TIICs. CD4+ helper T-cells and cytotoxic CD8+ T-cells play a significant role in preventing tumors by targeting antigenic tumor cells 10 , and CD8+ T-cells are linked with better clinical outcomes and response to immunotherapy in many cancers 11,12 . Furthermore, it has recently been observed that tumor-associated B cells, which have significant roles in the immune system by producing antibodies and presenting antigens, could be predictors of survival and response to immune checkpoint blockade therapy 13 . Additionally, a relationship between TIIC gene signatures and lower survival rates has been observed in ccRCC patients, and tumor-associated macrophages (TAM) and 22 T cell phenotypes are found to be correlated with clinical outcomes 14,15 . These observations emphasize the importance of analyzing the cellular heterogeneity of tumors, including immune cell variations, to identify target tumors for each specific treatment and to design new effective cancer treatments 16 .
There are some experimental approaches, such as single cell analysis tools including immunohistochemistry and flow cytometry, for observing tumor immune infiltrates; however, these methods are expensive and time consuming, and they are limited to analyzing a few immune cell types simultaneously 17 . For this reason, several
The most frequent immune cells in ccRCC tumors are macrophages, CD4+ T-cells, and CD8+ T-cells.
Experimental studies have found that T cells (CD4+ T-cells and CD8+ T-cells) are the main immune cell population in ccRCC tumors 14,20 . The results of the experimental study by Chevrier et al. 14 show that macrophages are the most frequent immune cells in most ccRCC tumors, with a mean of 31%, followed by CD8+ T-cells and CD4+ T-cells, respectively (H), which agrees with the results of CIBERSORTx applied to the TCGA data set (Fig. 1C,I).
There is a negative correlation between the number of macrophages and CD8+ T-cells.
The results of mass cytometry analysis indicate a negative correlation between CD8+ T-cells and macrophages, with a Pearson correlation coefficient of −0.67. Importantly, digital cytometry applied to the TCGA data set confirms this negative correlation between the number of CD8+ T-cells and macrophages in ccRCC, with a correlation coefficient of −0.46 (Fig. 1D,E).
There are no significant differences in overall survival months or age at diagnosis between clusters.
Figure 2 indicates no significant differences in the overall survival of patients between any of these clusters; this figure also reveals some other interesting observations. For example, patients in Cluster CD4 < CD8 ≈ MΦ with and without tumors at the last follow-up have similar overall survival months, while in all other clusters patients with tumors have substantially fewer survival months than patients without tumors at the last follow-up (Fig. 2G). Moreover, patients with tumors in this cluster have a remarkably higher age at diagnosis than patients without tumors in this cluster (Fig. 2J). Furthermore, female patients in this cluster have a noticeably higher age at diagnosis but the same survival as male patients in this cluster (Fig. 2H,K). Additionally, female patients in Cluster CD4 < MΦ < CD8 have substantially more overall survival months than male patients in this cluster, while females have a slightly higher age at diagnosis than males in this cluster. Importantly, there are no significant differences in the age at diagnosis or survival months of patients in each cluster based on the location of their primary tumors, left or right kidney (Fig. 2I,L).

Higher grade and stage of ccRCC tumors have a higher percentage of CD8+ T-cells and lower percentages of mast cells and monocytes.
A study of 87 ccRCC patients indicates that the percentage of tumor infiltrating CD8+ T-cells co-expressing PD-1 and Tim-3 is correlated with an aggressive phenotype and a larger tumor size at diagnosis 22 . Another study found that the grade of ccRCC tumors increases with the level of CD8+ T-cells 20 . Figure 3 also indicates that grade 3-4 and stage T3-T4 ccRCC tumors have a significantly higher percentage of CD8+ T-cells than stage T1-T2 and grade 1-2 tumors (P-value < 0.01), which is consistent with the observations of Fig. 2. Figure 3 also indicates that the percentages of mast cells and monocytes in ccRCC tumors significantly decrease as the grade and stage of tumors increase (P-value < 0.01). Note that Clusters ( CD8 < CD4 < MΦ ) and ( CD8 < CD4 ≈ MΦ ), which have a higher frequency of mast cells and monocytes and a lower frequency of CD8+ T-cells, have the lowest percentage of grade 3 and 4 tumors (Figs. 1J and 2). Some studies have reported a correlation between a high density of CD8+ T-cells in RCC patients and shorter overall survival 23,24 . Similarly, we observe that patients in Cluster ( CD4 < MΦ < CD8 ), which has the highest amount of CD8+ T-cells, have the worst survival outcome among all clusters (Fig. 2).
Tumor free patients have significantly higher percentages of mast cells in their primary tumors.
NK cells are known for their roles in immune surveillance and destruction of tumor cells 25,26 . Moreover, flow cytometric and immunohistochemistry analyses show that a high number of NK cells is associated with improved survival 23 and negatively correlated with the grade of tumor 20 . Also, Fig. 4A shows that primary tumors of patients who are tumor free at the last follow-up have a significantly higher level of NK cells than those of patients with tumors (P-value < 0.01). However, a closer look at the clusters reveals that the significant difference (P-value < 0.01) in the percentage of NK cells between tumor-free patients and patients with tumors corresponds to the patients in Cluster ( CD4 < CD8 ≈ MΦ ) (Fig. 4B).
In a recent study, 259 ccRCC patients were clustered into two groups based on their immunohistochemistry profiles, and it was observed that patients in the cluster with high mast cell infiltration have a better response to treatments and higher survival 24 .
Interferon γ (IFNγ), encoded by the IFNG gene, is a cytokine that is essential for innate and adaptive immunity. It works as an activator of macrophages and a stimulator of NK cells and neutrophils 27 , and it is mostly produced by T-cells and NK cells in response to a variety of inflammatory or immune stimuli 28 . Notably, the expression level of IFNG is significantly positively correlated with the percentage of CD8+ T-cells and the expression level of PDCD1 in ccRCC tumors, with correlation coefficients of 0.79 and 0.87, respectively. In addition, cluster ( CD4 < MΦ < CD8 ) has the highest IFNG expression level and cluster ( CD8 < CD4 ≈ MΦ ) has the lowest IFNG expression level, as expected (Fig. 5).
In contrast, there is only a slightly positive correlation between the expression levels of the CD274 and PDCD1LG2 genes, which encode PD-L1 and PD-L2 respectively, and the expression levels of PDCD1 and IFNG and the percentage of CD8+ T-cells in ccRCC tumors (Fig. 5E). In addition, cluster ( CD8 < CD4 ≈ MΦ ) has the lowest levels of CD274 and PDCD1LG2 compared to the other clusters (Fig. 5B,D).
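The correlation coefficients reported above can be computed directly from paired per-tumor values. The following is a minimal sketch, assuming a Pearson correlation and using synthetic data; the arrays `cd8_fraction` and `ifng_expression` are hypothetical stand-ins for the estimated CD8+ T-cell percentages and IFNG expression levels, not the TCGA values.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)

# Hypothetical per-tumor values: CD8+ T-cell fraction and a gene
# expression level that rises with it plus measurement noise.
cd8_fraction = rng.uniform(0.05, 0.6, size=100)
ifng_expression = 2.0 * cd8_fraction + rng.normal(scale=0.1, size=100)

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = pearsonr(ifng_expression, cd8_fraction)
print(f"Pearson r = {r:.2f}, P-value = {p_value:.2e}")
```

A coefficient near 1 indicates a strong positive linear association, and a value near 0 indicates little linear association.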

There is a significant association between RGS5 expression level and the percentages of NK cells, monocytes, and mast cells.
RGS5 is a member of the regulators of G protein signaling (RGS) family, whose members are known as signal transduction molecules that regulate heterotrimeric G proteins by acting as GTPase activators. Moreover, RGS5 is a hypoxia-inducible factor-1-dependent gene involved in the induction of endothelial apoptosis. In our previous study on TCGA data, we found that a high expression level of RGS5 in ccRCC primary tumors is associated with more survival months, and that the RGS5 expression level significantly decreases as the grade of the ccRCC tumor increases 29 . Interestingly, cluster ( CD8 < CD4 ≈ MΦ ) has the highest RGS5 expression level compared to the other clusters, and tumor-free patients have a higher level of RGS5 expression than patients with tumors (Fig. 6A). Saliently, ccRCC tumors

Discussion
Immune checkpoints are essential parts of the immune system, and they are crucial for preventing autoimmune diseases. However, some tumors benefit from these checkpoints, because they can prevent the immune system from killing cancer cells. One such immune checkpoint is the programmed cell death 1 (PD-1) protein, which binds to its ligand PD-L1 and inhibits immune cell activities, including T-cell activities. One strategy for cancer immunotherapy is to block these checkpoints to promote anti-cancer T-cell activities 30-33 . Immunotherapy, such as targeting the PD-1 pathway, has improved the overall survival of patients with several metastatic cancers, including melanoma, head and neck cancer, renal cell carcinoma, non-small cell lung cancer (NSCLC), and colon cancer. However, many patients do not respond to these treatments, and some patients who initially respond might develop resistance or experience severe adverse events 34-38 . For this reason, further biomarkers of tumor cells such as PD-1 and PD-L1 and of tumor infiltrating immune cells such as T-cells and macrophages need to be established to develop new treatment strategies and identify the patients who can be treated by each drug or treatment strategy 39 . In kidney cancer, common immunotherapy drugs such as nivolumab and avelumab target the PD-1, PD-L1, and PD-L2 pathways 40 . Anti-PD-1 drugs target T-cells directly, while anti-PD-L1 drugs target tumor cells directly, and they may also target tumor-associated macrophages that express PD-L1. Several studies indicate an increase of IFNγ production under PD-1 inhibitors and other immune checkpoint blockade therapies that resulted in the destruction of cancer cells 41-43 , and a relation between cancer immunotherapy improvement and an increase of IFNγ expression has been observed 28 . Moreover, a correlation has been observed between an increase in IFNγ gene expression and better progression-free survival in NSCLC and urothelial cancer patients treated with a PD-L1 inhibitor 44 .
www.nature.com/scientificreports/
Note that tumors in cluster ( CD4 < MΦ < CD8 ) have high expression levels of IFNG, the gene encoding IFNγ, and PDCD1, the gene encoding PD-1, compared to the other clusters, and the expression levels of these genes are significantly positively correlated with the percentage of CD8+ T-cells in tumors. Importantly, it has been shown that IFNγ boosts CD8+ T-cell expansion 45 . Thus, patients in cluster ( CD4 < MΦ < CD8 ) might respond to PD-1 inhibitors. In addition, since there is no strong correlation between the PDCD1LG2 and CD274 expression levels and the levels of the IFNG and PDCD1 genes, PD-L1 and PD-L2 inhibitors might not be as effective as PD-1 inhibitors for the patients in this cluster. Although Cluster ( CD8 < CD4 ≈ MΦ ) includes a high number of patients with lower-grade tumors and without tumors at the last follow-up, tumors in this cluster have lower levels of IFNG and PDCD1; therefore, patients in this cluster may not be good candidates for anti-PD-1 therapies.
Anti-angiogenic agents (AA) are among the main treatments for aggressive ccRCC 1 , because the nutrients and oxygen that tumor growth depends on come from the blood. Anti-angiogenics, also known as angiogenesis inhibitors, are drugs that stop the growth of blood vessels (angiogenesis) that tumors need to grow 46 . A study of in vitro cell lines and in vivo mouse models of ccRCC shows that the recruitment of mast cells is associated with increased ccRCC angiogenesis through modulation of the PI3K → AKT → GSK3β → AM signaling pathway 47 . Since Cluster ( CD8 < CD4 ≈ MΦ ) has the highest amount of mast cells compared to the other clusters, angiogenesis inhibitors might be a good treatment option for the patients in this cluster. Moreover, mast cells are suggested as an independent prognostic factor in some studies of ccRCC patients 48,49 . It has been observed that the number of mast cells is negatively correlated with 5-year survival 49 and positively correlated with grade, pT stage, and metastasis 50 . Contradicting these observations, a recent study of ccRCC patients shows that increased mast cell infiltration is linked with better treatment response and survival 24 . We have similarly observed that the number of mast cells is inversely correlated with the grade of tumors (Fig. 3A,D), and that the primary tumors of patients without tumors at the last follow-up have higher percentages of mast cells than the primary tumors of patients with tumors at the last follow-up.
Kruger et al. 51 suggested the RGS5 gene as a tumor-associated antigen (TAA), and they observed over-expression of RGS5 in a large-scale analysis of ccRCC specimens. Another study found that RGS5 is strongly up-regulated in a broad variety of malignant cells and showed that RGS5 peptides might be good candidates for designing cancer vaccines to target malignant cells and tumor vessels 52 . We found that patients with higher RGS5 levels have significantly higher percentages of NK cells, mast cells, and monocytes in their primary tumors (P-value < 0.01). Moreover, patients in Cluster ( CD8 < CD4 ≈ MΦ ) have the highest RGS5 expression in their primary tumors. With further investigation, the RGS5 gene might prove to be a good target for patients in this cluster. Further clinical and biological studies are required to test and validate all of the above suggestions.

Materials and methods
We estimated the percentage of tumor infiltrating immune cells in ccRCC tumors using the CIBERSORTx deconvolution method, which is based on the following linear model:
$$A X = b \qquad (1)$$
where b, which is called the mixture data, is the gene expression profile of the bulk tumor, and X is the vector of unknown cell proportions in b. A, which is called the signature matrix, is the gene expression profile of the cells.
In the first version of CIBERSORT, a machine learning technique, Nu-Support Vector Regression (ν-SVR), is used to solve problem (1) 53 . The solution of Eq. (1) is determined by a hyperplane that captures the data points inside an ε-tube, which is determined by the support vectors (genes in the signature matrix). SVR penalizes the data points outside the ε-tube, and a small value is used for ν, which determines the lower bound on the fraction of support vectors and the upper bound on the fraction of training errors. The regression coefficients of the ν-SVR method are the values of the vector X. However, the proportions must be non-negative and sum to one. Therefore, negative coefficients are set to 0, and the remaining coefficients are normalized to sum to 1 53 . Newman et al. 18 have recently improved their method by adding batch correction modes to remove possible cross-platform variations between the signature matrix and the mixture data.
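The deconvolution step described above can be illustrated with a minimal sketch. It assumes scikit-learn's `NuSVR` with a linear kernel as a stand-in for the ν-SVR solver, a synthetic signature matrix, and a hypothetical helper `deconvolve`; it is not the CIBERSORTx implementation, and it omits feature selection, standardization, ν selection, and batch correction.

```python
import numpy as np
from sklearn.svm import NuSVR


def deconvolve(A, b, nu=0.5):
    """Estimate cell fractions X in A X = b (hypothetical helper).

    A: signature matrix, shape (genes, cell types).
    b: mixture expression profile, shape (genes,).
    """
    # Each gene is one training sample; each cell type is one feature.
    model = NuSVR(kernel="linear", nu=nu, C=1.0)
    model.fit(A, b)
    coef = model.coef_.ravel()       # one regression coefficient per cell type
    coef = np.clip(coef, 0.0, None)  # negative coefficients are set to 0
    total = coef.sum()
    if total == 0:
        raise ValueError("all coefficients are non-positive")
    return coef / total              # normalize so the fractions sum to 1


# Toy example (assumed values): 200 genes, 4 cell types.
rng = np.random.default_rng(0)
A = rng.gamma(shape=2.0, scale=1.0, size=(200, 4))
true_x = np.array([0.5, 0.3, 0.15, 0.05])
b = A @ true_x + rng.normal(scale=0.01, size=200)

fractions = deconvolve(A, b)
print(np.round(fractions, 3))
```

Because the SVR objective regularizes the coefficients, the recovered fractions only approximate `true_x`; clipping and renormalizing is what guarantees non-negative values that sum to one.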
To investigate the immune variations in renal cancer, we downloaded the TCGA data set 54 of gene expression profiles of 607 ccRCC primary tumors from UCSC Xena 55 to use as the mixture data b. We used LM22 signature  53 . We then estimated cell fractions in ccRCC tumors using CIBERSORTx B-mode to remove technical differences between the LM22 signature matrix and the TCGA RNA-seq data. Note that the genes used to identify each type of immune cell in the LM22 signature matrix can be found in the supplementary file of the CIBERSORT paper 53 .
After we estimated cell proportions, we included only cases with CIBERSORTx P-value < 0.05. We then applied the unsupervised K-means clustering algorithm to cluster tumors based on their percentages of immune cells. The K-means algorithm separates samples into k groups of equal variance by minimizing the inertia (the sum of squared distances between the samples in each cluster and the center of that cluster). To determine the optimal number of clusters (k), we used the elbow method 56 .
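The filtering, clustering, and elbow steps above can be sketched as follows. This is a minimal illustration assuming scikit-learn's `KMeans` and synthetic data; the cluster centers, the array names, and the helper `inertia_curve` are hypothetical and do not reproduce the TCGA analysis.

```python
import numpy as np
from sklearn.cluster import KMeans


def inertia_curve(samples, k_values, seed=0):
    """Return the K-means inertia for each candidate number of clusters."""
    inertias = []
    for k in k_values:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(samples)
        inertias.append(km.inertia_)
    return np.array(inertias)


rng = np.random.default_rng(1)

# Toy immune profiles (macrophages, CD8+, CD4+) around four assumed centers.
centers = np.array([
    [0.40, 0.40, 0.20],
    [0.70, 0.15, 0.15],
    [0.15, 0.70, 0.15],
    [0.35, 0.15, 0.50],
])
cell_fractions = np.vstack(
    [c + rng.normal(scale=0.02, size=(50, 3)) for c in centers]
)

# Toy deconvolution p-values; keep only cases with P-value < 0.05.
p_values = rng.uniform(0.0, 0.1, size=len(cell_fractions))
kept = cell_fractions[p_values < 0.05]

# Elbow method: inspect where the inertia stops dropping sharply.
k_values = list(range(1, 9))
inertias = inertia_curve(kept, k_values)
for k, value in zip(k_values, inertias):
    print(k, round(float(value), 4))

# Cluster labels at the k suggested by the elbow (4 in this toy data).
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(kept)
```

In this toy data the inertia falls steeply up to k = 4 and flattens afterwards, which is the pattern the elbow method looks for.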
We also collected clinical information of patients from cBioPortal 57 . After dropping patients with missing clinical information, we continued our analysis with 526 patients. Patients' characteristics are given in Table 1.
For statistical analyses, we used the non-parametric Mann-Whitney-Wilcoxon (MWW) test between groups of continuous variables, because values in the comparison groups are not normally distributed and the comparison groups contain different numbers of patients. MWW tests whether the values in one of two comparison groups are significantly larger than those in the other 58 . We also used the chi-squared test to determine whether there is a statistically significant difference between the frequencies of the categorical variables. Stars in the figures show the significance levels, where ns: 0.05 < P ≤ 1, *: 0.01 < P ≤ 0.05, **: 0.001 < P ≤ 0.01, ***: 0.0001 < P ≤ 0.001, ****: P ≤ 0.0001.
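The tests and the star annotation above can be sketched as follows. This is a minimal illustration assuming SciPy's `mannwhitneyu` and `chi2_contingency`; the helper `significance_stars` and all data values are hypothetical, chosen only to show the mechanics.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu


def significance_stars(p):
    """Map a P-value to the star notation used in the figures."""
    if p <= 1e-4:
        return "****"
    if p <= 1e-3:
        return "***"
    if p <= 1e-2:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


rng = np.random.default_rng(2)

# Toy continuous variable (e.g. a cell percentage) in two groups of
# different sizes, compared with the two-sided MWW test.
tumor_free = rng.beta(2, 20, size=120)
with_tumor = rng.beta(2, 40, size=45)
u_stat, p_mww = mannwhitneyu(tumor_free, with_tumor, alternative="two-sided")

# Toy 2x2 contingency table (e.g. cluster membership by tumor status),
# compared with the chi-squared test.
table = np.array([[30, 10], [20, 25]])
chi2, p_chi, dof, expected = chi2_contingency(table)

print("MWW:", significance_stars(p_mww))
print("Chi-squared:", significance_stars(p_chi))
```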

Ethics.
No ethical approval was required for this study.

Data availability
The TCGA data 54 underlying this article are available at https://www.cbioportal.org/datasets 57 and https://xenabrowser.net/datapages/ 55 .
Table 1. Patients' characteristics. Sub-tables indicate the number of patients in each category. Differences in the numbers are due to missing information for some patients.