Comprehensive immunogenomic landscape analysis of prognosis-related genes in head and neck cancer

Head and neck cancer is the sixth most common malignancy around the world, and 90% of cases are squamous cell carcinomas. In this study, we performed a systematic investigation of the immunogenomic landscape to identify prognostic biomarkers for head and neck squamous cell carcinoma (HNSCC). We analyzed the expression profiles of immune‐related genes (IRGs) and clinical characteristics by interrogating RNA-seq data from 527 HNSCC patients in the cancer genome atlas (TCGA) dataset, including 41 HPV+ and 486 HPV− samples. We found that differentially expressed immune genes were closely associated with patient prognosis in HNSCC by comparing the differences in gene expression between cancer and normal samples and performing survival analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed to annotate the biological functions of the differentially expressed immunogenomic prognosis-related genes. Two additional cohorts from the Oncomine database were used for validation. 65, 56 differentially expressed IRGs was associated with clinical prognosis in total and HPV- samples, respectively. Furthermore, we extracted 10, 11 prognosis-related IRGs from 65, 56 differentially expressed IRGs, respectively. They were significantly correlated with clinical prognosis and used to construct the prognosis prediction models. The multivariable ROC curves (specifically, the AUC) were used to measure the accuracy of the prognostic models. These genes were mainly enriched in several gene ontology (GO) terms related to immunocyte migration and receptor and ligand activity. KEGG pathway analysis revealed enrichment of pathways related to cytokine−cytokine receptor interactions, which are primarily involved in biological processes. In addition, we identified 63 differentially expressed transcription factors (TFs) from 4784 differentially expressed genes, and 16 edges involving 18 nodes were formed in the regulatory network between differentially expressed TFs and the high-risk survival-associated IRGs. B cell and CD4 T cell infiltration levels were significantly negatively correlated with the expression of prognosis-related immune genes regardless of HPV status. In conclusion, this comprehensive analysis identified the prognostic IRGs as potential biomarkers, and the model generated in this study may enable an accurate prediction of survival.

locally advanced HNSCC develop recurrence or distant metastases, such that the 5-year overall survival does not usually exceed 60% 6 . The locoregionally advanced and distantly metastatic HNSCC cases that are unsuitable for surgery or radiotherapy are significantly associated with a poor prognosis, with an expected survival on the order of 6-10 months 7 .
Extracted from TCGA data, protein-coding genes that paint a molecular portrait of the disease can be helpful tools and can be tested as biomarkers. Many studies analyzing the cellular landscape indicated that several genes were differentially expressed between tumor and healthy tissues. However, the differential expression status of the IRGs in HNSCC has not been revealed by comprehensive analysis.
In recent years, it has been well established that the immune system plays a pivotal role in the control of tumor growth, and it has been suggested that potentially invading cancer cells are held in equilibrium via the immune system 8 . Cutting-edge immunotherapy treatments, which are beneficial to recurrent/metastatic (R/M) HNSCC patients, have recently revolutionized the treatment of multiple cancers 9 . Some clinical trials have demonstrated that immune checkpoint therapy is effective for R/M HNSCC and has less toxicity than other therapies 10 . Interestingly, one of the major advantages of immunotherapy over other forms of systemic cancer therapy is that responses can be quite durable-with clinical benefit sometimes measured in years. Long-term follow-up data for survival outcomes following immunotherapy for R/M HNSCC demonstrated that pembrolizumab exhibited durable antitumor activity and a high survival rate in advanced HNSCC patients. The overall response rate was 18%, while 85% of responses lasted 6 months or more, and 71% of responses lasted a year or more 11 . Many studies have emphasized that more immunological biomarkers need to be discovered to provide prognostic information  www.nature.com/scientificreports www.nature.com/scientificreports/ and facilitate clinical decision-making. Therefore, identifying immune-related biomarkers to manipulate the immune response and achieve greater clinical benefit for R/M HNSCC patients is key.
In this study, we combined the expression profiles of immune-related genes (IRGs) with clinical information by comparing the differences in gene expression between normal and tumor tissues and performing Cox regression analysis. Furthermore, bioinformatics analysis was utilized to explore the intrinsic regulatory mechanisms of immune-related genes. Our findings reveal the potential clinical application of IRGs as biomarkers in prognosis prediction, and IRGs may represent therapeutic targets for HNSCC immunotherapy.

Results
Identification of differentially expressed IRGs. In the present study, we downloaded a total of 56753 genes information from TCGA database including tumor and normal samples. 4784 differentially expressed genes were identified, 3603 of which were upregulated and 1181 of which were downregulated in HNSCC patients (Fig. 1a,b). Furthermore, 399 differentially expressed IRGs were identified from a subset of the 4784 genes, 304 of which were upregulated and 95 of which were downregulated in HNSCC patients (Table 1; Fig. 1c,d). In addition, the functions of the intersecting genes (399 differentially expressed IRGs) were predicted. In the Gene Ontology (GO) and KEGG pathway analyses, 1550 terms and 94 pathways were identified. The top three GO terms were "leukocyte migration (GO:0050900)", "regulation of immune effector process (GO:0002697)" and "regulation of inflammatory response (GO:0050727)" in terms of biological processes; "extracellular matrix (GO:0030198)", "side of membrane (GO:0098552)" and "collagen-containing extracellular matrix (GO:0062023)" in terms of cellular components; and "receptor ligand activity (GO:0048018)", " receptor regulator activity (GO:0030545)" and "cytokine activity (GO:0005125)" in terms of molecular functions (Fig. 2a,c,e). In the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analysis, cytokine-cytokine receptor interaction (hsa04060) was the pathway most often enriched by the differentially expressed IRGs (Fig. 2b,d,f).

Identification of survival-associated IRGs.
To explore the relationship between IRGs and prognosis, we identified 65, 56 IRGs that were significantly associated with overall survival (OS) from the total and HPV-HNSCC patients, respectively (P < 0.05; Table 2A,B; Fig. 3a,b). Moreover, we divided 65 IRGs into 23 high-risk (HR > 1) genes and 42 low-risk (HR < 1) genes (Fig. 3a). We also identified 20 high-risk (HR > 1) genes and 36 low-risk (HR < 1) genes from 56 IRGs (Fig. 3b). Similar to the results from the previous enrichment analysis of differentially expressed IRGs, 65 survival-associated IRGs were mainly involved in several Gene Ontology (GO) terms related to immunocyte migration and the activity of receptors and ligands (Fig. 4a,c,e). Additionally, KEGG pathway analysis indicated that survival-associated IRGs were enriched in "cytokine-cytokine receptor interaction (hsa04060)" (Fig. 4b,d,f). Interestingly, the results of 56 survival-associated IRGs enrichment analysis were almost consistent with our earlier research. GO terms immunocyte migration and the activity of receptors and ligands were mainly enriched (Fig. 5a,c,e). KEGG pathway "cytokine-cytokine receptor interaction (hsa04060)" was also enriched (Fig. 5b,d,f). transcription factor (tf) regulatory network. We found 63 differentially expressed transcription factors (TFs) within the genes that were differentially expressed between HNSCC patients and normal patients, 46 of which were upregulated and 17 of which were downregulated (Fig. 6a,b). To investigate the relationship between the differentially expressed TFs and 23 high-risk survival-associated IRGs, we constructed a regulatory network based on them. In this module (Fig. 6c), 16 edges involving 18 nodes were formed. SNA12 was remarkable for having the most connections with other high-risk genes, while BIRC5 was remarkable for having the most connections with other TFs.
immune gene-related prognosis model. To establish a prognosis prediction model based on survival-associated IRG expression, prognostic genes for HNSCC were identified by multivariable Cox regression analysis. Finally, to avoid false positive results, 10 genes were proven to be predictive of clinical outcomes via multivariable Cox regression analysis (P < 0.01) and constructed the optimal model according to Akaike information criterion. The model corresponding to minimum AIC value (AIC = 1,859.31) represented the target model. SEMA3G, GNRH1 and ZAP70 were positively correlated with OS. PLAU, SFTPA2, CCL26, DKK1, GAST, PDGFA and STC1 were negatively correlated with OS (Table 3A). Thus, the expression data of these prognostic genes and their coefficients were used to develop a gene-based prognosis prediction model, with a formula as . The 10-gene-based model was used to calculate a risk score for each sample as described above (Fig. 7a,c,e). All patients were divided into a high-risk group (n = 237) or a low-risk group (n = 238) according to the median risk score after removing some patients with missing clinical characteristics data. According to the Kaplan-Meier survival analysis, the overall survival time was significantly different between the high-risk and low-risk groups, and the five-year survival rates were 31.2% and 66.8%, Table 1. Identification of the differentially expressed immune-related genes (The logFC means log2|fold change | , log F/C > 1 indicated that the expression of genes were up-regulated in HNSCC patients, while log F/C < 1 indicated that expression of genes were down-regulated in HNSCC patients). The FDR means the false discovery rate and the FDR value is less than 0.05 as the filter criterion. Tumor mean represented the average gene expression in tumor samples and normal mean represented the average gene expression in normal samples. www.nature.com/scientificreports www.nature.com/scientificreports/ respectively (Fig. 8a). The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was 0.750, suggesting that the model could predict the survival outcomes of HNSCC patients (Fig. 8c).
For HPV-HNSCC patients, we also identified 11 genes to predict clinical outcomes and establish the prognosis predict model via multivariable Cox regression analysis (P < 0.01). The same as the above methods and standards, the model corresponding to minimum AIC value (AIC = 1814.05) represented the optimal model. SEMA3G, GNRH1, TNFRSF4 and ZAP70 were positively correlated with OS. PLAU, SH2D1A, CCL26, DKK1, GAST, PDGFA and STC1 were negatively correlated with OS (Table 3B). Then, the 11-gene-based model was also used to calculate a risk score for each sample with an alike formula (Fig. 7b,d,f). Similarly, HPV-patients were divided into a high-risk group (n = 222) or a low-risk group (n = 223) and the five-year survival rates were 28.5% and 63.1%, respectively (Fig. 8b). The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was 0.743, suggesting that the model could predict the survival outcomes of HPV-HNSCC patients (Fig. 8d).
The relationship between IRGs and clinical factors. The relationships between the single genes and clinical factors were analyzed. The results indicated that the gene expression level of PDGFRB was significantly higher in males than in females whether total HNSCC patients (Fig. 9a) or HPV-HNSCC patients (Fig. 10a). AR was significantly higher in stage III and IV disease than in stage I and II disease, ICOS was significantly higher in T1-2 disease than in T3-4 disease whether total HNSCC patients (Fig. 9c,d) or HPV-HNSCC patients (Fig. 10c,d). Among the total HNSCC patients, NR3C2 was significantly higher in grade 1 and 2 disease than in grade 3 disease (Fig. 9b). Among the HPV-HNSCC patients, DEFB1 was significantly higher in grade 1 and 2 disease than in grade 3 disease, PAEP was significantly higher in N0 disease than in N1-3 disease (Fig. 10b,e). the relevance analysis of risk score and immune cell infiltration. To reveal whether the immune-related genome can alter the tumor immune microenvironment, we analyzed the relationship between the risk score of IRGs and immune cell infiltration. B cell and CD4 T cell infiltration levels were significantly negatively correlated with the risk score in both total HNSCC patients (Fig. 11a,c) and HPV-patients (Fig. 11b,d) (P < 0.05). In total HNSCC patients, the infiltration of CD8 T cells, dendritic cells and macrophages were not statistically significant (P > 0.05) (Fig. 11e,g,i), while CD8 T cells, dendritic cells and macrophages infiltration levels were significantly negatively correlated with the risk score in the HPV-patients (P < 0.05) (Fig. 11f,h,j). The infiltration of neutrophils was not statistically significant in both total and HPV-HNSCC patients (P > 0.05) (Fig. 11k,l). independent survival analysis. The univariable and multivariable Cox regression analyses suggested that the risk score could become an independent predictor for OS after adjusting for other parameters, including TN stage and risk score whether total HNSCC patients (Fig. 12a,b) or HPV-HNSCC patients (Fig. 12c,d) (P < 0.05).   (31 samples). We extracted the expression levels of the survival-related IRGs, relevant clinical characteristics and follow-up times, and we performed survival analysis and created Kaplan-Meier survival curves (Fig. 13). As a result, 10 of the survival-related IRGs we identified were significantly associated with clinical prognosis according to the survival curves. SEMA3G, GNRH1 and ZAP70 were negatively correlated with OS, whereas PLAU, SFTPA2, CCL26, DKK1, GAST, PDGFA and STC1 were positively correlated with OS. Similarly, we also calculated the risk scores of 112 samples and created survival curves (Fig. 13). The results from this verification cohort were consistent with the results we obtained before, indicating their reliability and repeatability.  Table 2. Relationships between the expression of immune-related genes and overall survival in head and neck squamous cell carcinoma. (A) The prognosis-related genes in total HNSCC patients, (B) The prognosis-related genes in HPV-HNSCC patients. Immune-related genes were divided into high risk and low risk the prognosisrelated genes via HR value (HR > 1 indicated high risk and HR < 1 indicated low risk). www.nature.com/scientificreports www.nature.com/scientificreports/

Discussion
The head and neck squamous cell carcinoma (HNSCC) is a common cancer which has attracted considerable and increasing attention 12 . While many patients with locally advanced HNSCC could be treated by surgery, radiation, chemotherapy and the combination application, those who develop recurrent/metastatic disease (R/M) has a median overall survival of less than a year 13 . Therefore, because second-line treatment options for advanced HNSCC are limited, immunotherapy has attracted increasing attention. However, depending on the different patterns of tumor-infiltrating immunocytes, individuals respond significantly differently after immunotherapy.  A number of studies have found that differential expression of IRGs affects tumor prognosis and response to immunotherapy, probably because these genes cause different levels of infiltration of various immune cell subtypes in tumors [14][15][16][17] . The interaction of the tumor with its microenvironment is crucial in the development and progression of the tumor. A comprehensive analysis of the expression of immune-related genes and the functional roles of different subsets of tumor-infiltrating cells in the tumor immune microenvironment could improve our knowledge of immunology and define subgroups of patients who are more likely to respond to immunotherapy. Therefore, in this study, we identified the survival-associated IRGs that were significantly related with the development and progression of HNSCC. In the functional enrichment analysis, one significant pathway identified was related to cytokine-cytokine receptor interactions in HNSCC, and the main enriched GO terms were immunocyte migration and activity of the receptor and ligand regardless of HPV status. The immune cell infiltration was mainly mediated by chemokine/receptor expression networks and cancer genetic alterations in tumor tissue via a systematic analysis of multiple cancers 18 . As expected, biological functions of the identified genes were significantly associated with inflammation progression. The results suggested that these differentially expressed genes were mostly enriched in inflammation-related terms and pathways.
Based on this result, we selected 10 genes (PLAU, SFTPA2, CCL26, SEMA3G, DKK1, GAST, GNRH1, PDGFA, ZAP70, and STC1) and 11 genes (SEMA3G, GNRH1, TNFRSF4, ZAP70, PLAU, SH2D1A, CCL26, DKK1, GAST, PDGFA and STC1) that were closely related to clinical prognosis to construct two prognostic prediction models to assess potential clinical outcomes for the total HNSCC patients and HPV-HNSCC patients. Notably, GNRH1 has been suggested as a marker of the metastatic spread of gynecological cancer 19 . PDGFA, a member of the platelet-derived growth factor (PDGF) family, may play a crucial role in the composition of the immune microenvironment 20 . PLAU, one of the major proteolytic enzymes involved in the degradation of extracellular matrix, has been demonstrated to play a critical role in tissue remodeling and migration in the developmental of cancer and in tumorigenesis 21 . Some studies indicated that PLAU was a marker to predict OS in HNSCC patients 22 . Stanniocalcin-1 (STC1) is a secreted glycoprotein implicated in several pathologies, including inflammation and www.nature.com/scientificreports www.nature.com/scientificreports/ cancer. Several studies have shown that STC1 is associated with cancer development 23,24 . It was verified that STC1 could accelerate tumor growth and reduce disease-free survival in mice. SEMA3G is a member of the class 3 semaphorin family originally characterized in axonal guidance 25 . Semaphorins have been shown to play multiple roles in normal and pathologic angiogenesis by acting on their receptors, plexins and neuropilins 26 . Methylation of the SFTPA2 promoter represents a potential biomarker for lung cancer diagnosis. The SFTPA2 DNA methylation profile was used as a potential tool to monitor disease progression and immunity 27 . The immune-related gene ZAP70 was associated with an increased risk of developing virally mediated head and neck squamous cell carcinoma. Mutations in many of these genes have previously been implicated in cancer risk, viral host-response, or epithelial immunity 28 . Gastrin is a growth factor of the gastrointestinal mucosa, and its role in gastrointestinal tumorigenesis is well studied. High levels of gastrin have been correlated with the poor prognosis of lung cancer patients. Gastrin-releasing peptide (GRP) signaling appears to mediate the autocrine growth of human squamous cell carcinoma of the head and neck 29,30 . Dickkopf-1 (DKK-1) is a secreted protein, and the expression and DKK-1 is different in various cancers. The methylation of DKK1 may be considered a prognostic marker in oral cancer 31 . DKK1 knockdown increased cellular migration and invasiveness in oral cancer cells 32 . High expression of DKK-1 was associated with poor prognosis, and this suggests that DKK-1 may be a useful molecular marker in breast cancer 33 . OChemokine ligand 26 (CCL26) levels were elevated in and positively correlated with stage III and IV colorectal cancer (CRC) tissues and were associated with a poor prognosis in CRC patients 34 . TNFRSF4 (also known as OX40 or CD134) is a member of the tumor necrosis factor receptor superfamily expressed on activated T cells 35 . TNFRSF4 has also been shown to promote T cell survival, proliferation, and memory, enhance cytokine secretion, thus further enhance antitumor immunity 36 . Therefore, the quality of T cells is as important as T cell number in response to HMSCC immunotherapy. SH2 domains are commonly found in adapter proteins that aid in the signal transduction of receptor tyrosine kinase pathways 37 . SH2D1A is also a SH2 domain-containing protein, which mutations caused the X-linked lymphoproliferative syndrome (XLP) and was associated with B-cell lymphomas 38 . These findings indicate the potential clinical application of IRGs as biomarkers in prognosis prediction and the promise of HNSCC immunotherapy. Our study will also provide a bioinformatics evidence for the prognosis prediction of HPV-HNSCC patients.
In univariable and multivariable Cox regression analyses, interestingly, we found that the expression of prognostic immune-related genes (CD79A, NR3C2, and PDGFRB) was significantly correlated with the sex of HNSCC patients. The results indicated that the immune microenvironment and the levels of infiltration by immune cells may be different in male and female HNSCC patients. Thus, male and female patients respond differently to immunotherapy. Recent findings reported that immune checkpoint inhibitors were twice as effective as standard cancer therapies in the treatment of men with advanced solid tumors compared to their female counterparts 39,40 . Immune checkpoint inhibitors can improve overall survival for patients of both sexes with some types of advanced cancers, but men have a greater treatment effect from these drugs versus control treatments than do women 41 . Sex differences in congenital and adaptive immune responses are known, and women generally have a stronger immune response than men. Therefore, different treatment strategies are adopted for patients of different    HNSCC represents a group of tumors occurring at various sites, including the oral mucosa and the palatine tonsils. Adding to this diversity is the recent observation that a proportion of these cancers, notably tonsillar carcinomas, are associated with human papillomavirus (HPV) infection 41 . HPV has been implicated in the etiology of a subset of HNSCCs, which often arise in younger patients without a history of alcohol or tobacco use 42 . Approximately 30% of HNSCC tumors are HPV + , and despite late-stage presentation, they often have a better prognosis and response to therapy 45 . A high level of CD8 + T-cell infiltration might be an important factor contributing to the improved survival of HPV + HNSCC patients 46 . In current study, we also identified 13 differentially expressed IRGs that were significantly associated with overall survival (OS) of the HPV + HNSCC patients (Tables S2, 3; Fig. S1). However, the number of HPV + HNSCC samples are not enough for prognosis and immune cell infiltration analyses.
Tumor-infiltrating lymphocytes (TILs) are generally thought to represent a host immune response directed against antigens expressed on tumor cells, and a high density of TILs has been identified as a favorable marker in HNSCC 47 . Studies of HNSCC and other cancers have suggested a beneficial effect of B cells on outcome [48][49][50] . However, little is known about the role of TILs in the development of head and neck squamous cell carcinoma. No research has reported the relationship between the expression of IRGs and TIL status. Thus, knowledge of immune infiltration is important for an in-depth understanding of tumor and immune interactions. We investigated the relationship between prognosis-related immune gene expression and immune cell infiltration to determine the status of the HNSCC immune microenvironment and found that B cell and CD4 T cell infiltration levels were significantly negatively correlated with the expression of prognosis-related immune genes in total HNSCC patients. However, B cell, CD4 T cell, CD8 T cell, dendritic cell and macrophage infiltration levels were significantly negatively correlated with the risk score of the HPV-patients. IRGs were significantly associated with B cell infiltration in lung adenocarcinoma (LUAD), and these genes are involved in tumor immunity and may play an important role in the prognosis of patients 51 . Lin et al. demonstrated that an IRG-based risk score www.nature.com/scientificreports www.nature.com/scientificreports/ outcomes. We will aim to validate the molecular mechanism and further in-depth analysis the impact of different HPV status on the prognosis of HNSCC patients in future studies, particularly HPV + HNSCC patients.
Inevitably, there were several limitations in our study. The current results lack prognostic analysis of HPV + HNSCC patients due to small number of HPV + samples, and the lack of validation in a prospective clinical trial is also a limitation of the study. Moreover, the mechanism by which the expression of the prognosis-related IRGs affects the outcome of HNSCC patients remains unclear. www.nature.com/scientificreports www.nature.com/scientificreports/ In conclusion, we arrived at a more comprehensive understanding of the TME and created a list of prognostic immune-related genes with the potential to become prognostic biomarkers.

Materials and Methods
clinical samples and immune gene data. All data in this study were obtained from a public database.
The transcriptome profiling data of HNSCC samples were downloaded from the TCGA data portal (https:// portal.gdc.cancer.gov/), which contained data from 501 primary HNSCC and 44 nontumor samples. The clinical information of 527 HNSCC patients was extracted and downloaded for further analysis. We also identified immune-related genes via the Immunology Database and Analysis Portal (ImmPort) (https://www.immport. org/) (Table S1). ImmPort is an archival repository and dissemination platform for clinical and molecular datasets. ImmPort is an important source of raw data and protocols from clinical trials, mechanistic studies, and novel methods for cellular and molecular measurements 53 . In addition, ImmPort is also a database that updates real-time data accurately and provides a list of IRGs that are actively involved in tumor immunological processes for cancer research. Differential gene analysis. We performed differential gene analysis on all transcriptome profiling data using the R software limma package (http://www.bioconductor.org/packages/release/bioc/html/limma.html), with a false discovery rate (FDR) < 0.05 and log2 |fold change | > 1 as the cutoff values. Heatmaps were generated using the pheatmap package in R. Then, we extracted differentially expressed IRGs from the intersection of immune genes and all differentially expressed genes. To explore the potential molecular mechanisms of Figure 11. The relationships between infiltration abundances of six types of immune cells and the risk score in different groups of HNSCC patients. B cell and CD4 T cell infiltration levels were significantly negatively correlated with the risk score in both total HNSCC patients (a,c) and HPV-patients (b,d). In total HNSCC patients, the infiltration of CD8 T cells (e), dendritic cells (g) and macrophages (i) were not statistically significant (P > 0.05) while CD8 T cells (f), dendritic cells (h) and macrophages (j) infiltration levels were significantly negatively correlated with the risk score of the HPV-patients. The infiltration of neutrophils was not statistically significant (P > 0.05) in both total (k) and HPV-(l) HNSCC patients. (2020) 10:6395 | https://doi.org/10.1038/s41598-020-63148-8 www.nature.com/scientificreports www.nature.com/scientificreports/ differentially expressed IRGs, functional enrichment analysis was performed via GO and KEGG pathway analyses 54-56 using the R software clusterprofiler package 57 .

Survival-related IRGs and survival analysis.
We used Perl software to analyze the clinical characteristics and follow-up data downloaded from TCGA and chose overall survival (OS) as the primary endpoint. Survival-related genes were selected by univariable COX regression analysis (FDR < 0.05). We made proportional hazards assumptions based on Schoenfeld residuals (phtest) for the COX regression model. The significance value for the overall test of proportional hazards is less than 0.05 (P < 0.05). Hazards ratio (HR) is the ratio of tumor samples and normal samples IRGs expression. We defined the high-risk IRGs (HR > 1) and low-risk IRGs (HR < 1) with HR = 1 as a cutoff. The Kaplan-Meier survival curve was drawn using the R software survival package according to a significance filter of P < 0.05. Functional enrichment analysis was also performed on survival-related IRGs that were significantly associated with OS.
construction of the immune gene-related prognostic model. The expression data of these survival-related genes and their coefficients were used to develop a gene-based prognosis prediction model, and Akaike information criterion (AIC) was used to build the model in order to avoid the data of overfitting, with the minimum AIC value representing the target model 58 . The formula used is as follows: = ∑ × = risk score c oefgenei expgenei i 1 n 59 . Multivariable Cox regression analysis was used to illustrate the correlation of risk scores with patient overall survival (OS) and to identify potential prognostic genes. Patients were divided into high-and low-risk groups based on the median risk score value. correlation analysis of the clinical data. Relationships were analyzed between single genes and clinical factors via the R software beeswarm package. In addition, we also used clinical characteristics for univariable and multivariable Cox regression analyses with the R software survival package. The risk scores were divided into high-and low-risk groups based on median risk score, and ages were divided into ≤65 group and >65 group. The grades were divided into G1-2 and G3, stages were divided into I-II and III-IV groups, stage T were divided into T1-2 and T3-4 groups, stage N were divided into N0 and N1-3 groups and stage M were divided into M0 and M1 groups. We selected age (>65), sex (female), grade (G1-2), stage (I-II), T (1-2), M (M0) and N (N0) as the reference for each group. Risk score P-values of less than 0.001 were considered statistically significant.
Clinical relevance of tumor immune infiltration. We downloaded the immune infiltration levels of HNSCC patients using the Tumor IMmune Estimation Resource (TIMER) (http://cistrome.org/TIMER/), which is a web resource for the systematic evaluation of the clinical impact of different immune cells in diverse cancer types. We used this resource to investigate possible associations between the abundance of six subtypes of tumor-infiltrating immune cells (B cells, CD4 T cells, CD8 T cells, macrophages, neutrophils, and dendritic cells) and patient prognosis. Based on the immune gene-related prognostic model, we performed a correlation analysis combining the risk score and immune cell infiltration level of each sample via R software. transcription factor (tf) regulatory network. To explore the interactions between high-risk survival-associated IRGs and transcription factors (TFs), we obtained data on 318 TFs from the Cistrome online database (https://cistrome.org/). Cistrome is a comprehensive database for cancer transcription factor (TF) www.nature.com/scientificreports www.nature.com/scientificreports/ targets and provides regulatory links between TFs and transcriptomes. A protein-protein interaction (PPI) network was constructed based on data gleaned and displays many direct and indirect interactions with genes. The resulting PPI network was analyzed with Cytoscape software version 3.7.2. We extracted differential TFs to construct the regulatory network of the high-risk survival-associated IRGs (HR > 1) and potential TFs. Figure 13. Validation of TCGA results with Oncomine data. Kaplan-Meier survival curves were used to verify the relationship between genes (a). PDGFA, GAST and PLAU; (b) CCL26, SEMA3G and SFTPA2; (c) DKK1, GNRH1, ZAP70 and STC1) and risk scores (d) with overall survival. Red represents the high-risk group, and blue represents the low-risk group.