Development of a novel immune-related genes prognostic signature for osteosarcoma

Immune-related genes (IRGs) are responsible for osteosarcoma (OS) initiation and development. We aimed to develop an optimal IRGs-based signature to assess of OS prognosis. Sample gene expression profiles and clinical information were downloaded from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) and Genotype-Tissue Expression (GTEx) databases. IRGs were obtained from the ImmPort database. R software was used to screen differentially expressed IRGs (DEIRGs) and functional correlation analysis. DEIRGs were analyzed by univariate Cox regression and iterative LASSO Cox regression analysis to develop an optimal prognostic signature, and the signature was further verified by independent cohort (GSE39055) and clinical correlation analysis. The analyses yielded 604 DEIRGs and 10 hub IRGs. A prognostic signature consisting of 13 IRGs was constructed, which strikingly correlated with OS overall survival and distant metastasis (p < 0.05, p < 0.01), and clinical subgroup showed that the signature’s prognostic ability was independent of clinicopathological factors. Univariate and multivariate Cox regression analyses also supported its prognostic value. In conclusion, we developed an IRGs signature that is a prognostic indicator in OS patients, and the signature might serve as potential prognostic indicator to identify outcome of OS and facilitate personalized management of the high-risk patients.

Identification and assessment of the prognostic signature. To identify the optimal prognostic signature of OS based on IRGs, 82 prognostic-associated IRGs were identified by univariate Cox regression analysis of DEIRGs. Further, we identified the optimal prognostic signature that consisted of 13 prognosis-associated  Table 2). ROC curve results showed that the accuracy of this signature in diagnosing OS prognosis was high (Fig. 3B, AUC = 0.918). The Kaplan-Meier curve indicated that the overall survival of patients in the high-risk group was markedly worse than that in the low-risk group (Fig. 3C, p < 0.001). According to the optimal signature, we obtained the risk score distribution (Fig. 4A), the survival status (Fig. 4B), and the expression characteristics of the immune genes of OS (Fig. 4C). Compared to the low-risk group, the high-risk group had more deaths. In addition, the expression levels of GNRH1, VEGFA, TNFRSF11B, GAL, STC2, BRAF, BMP8A, and CORT were higher in the high-risk group, whereas patients in the low-risk group expressed higher levels of PSMD10, TNFRSF21, GRN, VAV1, and SDC3.

Comparison of the IRGs signature with other known prognostic biomarkers and verification in independent cohort.
To determine whether the IRGs signature has a better diagnostic capacity for OS patient survival, we conducted receiver operating characteristic (ROC) analysis of the IRG signature along with other known prognostic biomarkers (SP140, MALAT1, UCA1, and MIR191) in the training cohort. The results showed that the area under the curve (AUC) of the IRGs signature was increased compared to that for other known biomarkers (Fig. 5A), indicating that the IRG signature was a better prognostic biomarker and provided better stability and reliability in predicting the survival of OS patients. To further examine the prognostic value of the IRG signature, we conducted the ROC analysis in another independent cohort (GSE39055). The results showed that the AUCs were 0.92, 0.93, and 0.89 at 1, 3, and 5 years, respectively (Fig. 5B), suggesting that the IRG can also predict the survival of OS patients in other independent cohorts.
Independence of the IRGs signature in survival prediction from clinicopathological factors. An important feature of a good prognostic biomarker is that it should be independent of clinicopathological prognostic factors. Clinicopathological characteristics, such as the patient's age, sex, and metastasis, are also considered to be the main factors that determine the prognosis of OS patients. To evaluate the independence and applicability of the IRGs signature, we regrouped patients according to different clinicopathological characteristics and performed Kaplan-Meier survival analysis. The Kaplan-Meier curve showed that regardless of sex, age, and metastasis, the survival time of OS patients in the low-risk group was significantly prolonged (p < 0.05, Fig. 6A-C). All of results indicated that the IRGs signature showed satisfactory applicability when grouping patients according to different clinicopathological characteristics. Univariate and multivariate COX regression also suggested that the signature is an independent indicator for predicting the prognosis of OS patients (Table 3).

Relationship between the prognostic signature and clinical characteristics. The relationship
between clinical characteristics, such as metastasis, age, grade, and the risk score based on the prognosis-associated IRGs signature, was analyzed to validate the accuracy of the prognostic signature further. The results showed that metastasis groups had a significantly higher risk score than non-metastasis groups (Fig. 7C, p = 0.001). However, no significant association was observed between age (Fig. 7A, p = 0.531), sex (Fig. 7B, p = 0.485), and risk score.

Discussion
OS is the most common bone malignancy in children and adolescents, and it is also one of the main causes of cancer-related deaths in this age group 18 . Evidence demonstrates that the immune response defines the tumor's microenvironment. In particularly, immune cell disorders often cooccur with tumors and are considered an essential driver of OS development 19,20 . In this study, we analyzed the DEIRGs of the OS and control samples from TARGET and GTEx databases to identify new prognostic biomarkers by constructing a prognostic IRGs signature. Related studies show that chemotaxis, adhesion of leukocyte, and innate immunity are dysfunctional in the OS microenvironment, thereby reducing the immune response to OS cells [21][22][23] . PI3K/AKT signaling pathway 24 , MAPK signaling pathway 25 and JAK-STAT signaling pathway 26 have been extensively studied in the OS. C-C motif chemokine ligand 5 High CCL5 expression is associated with osteosarcoma metastasis and poor prognosis of patients with osteosarcoma 3 CCR5 C-C motif chemokine receptor 5 CCR5 controls the proliferation or differentiation of osteosarcoma 4 GAL Galanin and GMAP prepropeptide The overexpression of Gal-1 is well established in many types of cancer progression like osteosarcoma, breast, lung, prostate, melanoma, etc 5 S1PR1 Sphingosine-1-phosphate receptor 1 Downregulated S1PR1 suppresses osteosarcoma metastasis and proliferation 6 CXCR4 C-X-C motif chemokine receptor 4 CXCR4-mediated osteosarcoma growth and pulmonary metastasis 7 CXCL12 C-X-C motif chemokine ligand 12 CXCL12 plays a critical role in mediating tumor progression and the immune response in osteosarcoma 8 CXCR3 C-X-C motif chemokine receptor 3 CXCR3 correlates with immune infiltration and predicts poor survival in osteosarcoma 9 CXCL10 C-X-C motif chemokine ligand 10 CXCL10 plays an important role in cancer and autoimmunity 10 OPRL1 Opioid-related nociceptin receptor 1 OPRL1 plays a key role of pain and injury perception Scientific Reports | (2020) 10:18402 | https://doi.org/10.1038/s41598-020-75573-w www.nature.com/scientificreports/ Furthermore, the activation of these signaling pathways is strongly linked to the growth and metastasis of OS cells. Although natural killer cell-mediated cytotoxicity is the host's first-line anti-cancer defens 27 , the immune response is a seemingly double-edged sword in the OS microenvironment, as a dysregulated immune response is conducive to the occurrence and development of tumors.
In total, in our study, we obtained 604 DEIRGs. Of note, we identified 10 hub IRGs, namely CXCR3, CXCR4, CCR5, CCL5, CXCL10, CXCL12, CXCL16, OPRL1, S1PR1, and GAL. Among them, CXCR3 28 , CXCR4 29 , CCR5 30 , CCL5 31 , CXCL16 32 , CXCL10 33 , CXCL12 34 and GAL 35 have been widely studied in OS, and are involved in the occurrence, metastasis, and angiogenesis of OS. OPRL1 encodes proteins that are endogenous opioid-related neuropeptides and nociceptin/orphanin receptors, which plays a key role in pain perception and nociception 36,37 . The high expression of OPRL1 in OS may be related to cancerous pain. The coding product of the SIPR1 gene is a receptor protein that is similar to the G-protein-coupled receptor. When SIPR1 was combined with ligand S1P, the growth, invasion, and metastasis of lung cancer, ovarian cancer, and colon cancer are enhanced [38][39][40] . Hence, we can speculate that SIPR1 is pivotal in OS. Considering the similarity between molecular functions and cell components of hub IRGs, and through the ranking of semantic similarity, we discovered that CCR5, CXCL12 www.nature.com/scientificreports/ and CXCR4 are the most closely related genes. CCR5, CXCL12 and CXCR4 genes encode chemokine receptors or ligands, which plays a vital part role in the initiation and growth of OS 29,41,42 . These findings further support the reliability of our study. Previous research has shown that IRGs are closely related to OS metastasis and prognosis 43 . For example, Koirala et al. 44 found that immune cell infiltration and PD-L1 expression in the tumor microenvironment were independent risk factors for OS. Li Bo et al. 45 reported that CXC12 acts as a driver in OS metastasis and immune response, and knocking down CXC12 could effectively inhibit OS progression. Moreover, IRGs signatures have attracted widespread attention and have been used to predict metastasis and prognosis of different tumors [46][47][48] . Therefore, in order to further explore the value of IRGs in OS prognosis, we constructed a prognostic signature consisting of 13 prognostic-associated DEIRGs, which has a high diagnostic prognostic efficacy. The high expression lever of GNRH1 49 , BRAF 50 , PSMD10 51 and VEGFA 52 closely correlated with the growth, metastasis, and angiogenesis of OS. The high expression of GAL 53,54 , TNFRSF11B 55 and STC2 56 are linked to prostate cancer and colorectal cancer development and a worse prognosis. The abnormally high expression of BMP8A is an independent factor for the progression and poor prognosis of thyroid carcinoma 57 . CORT is an endogenous cyclic neuropeptide that can regulate the growth and metastasis of lung cancer and thyroid cancer 58,59 , and it also regulates the inflammatory response by inhibiting the immune infiltration 60 . Granulin a (GRNA) is a 6 kDa peptide hydrolyzed from PGRN, which can effectively inhibit the growth and invasion of human hepatoma cells 61 . The high expression of VAV1 is a positive prognostic factor for early invasive breast cancer 62 . Zong et al. 63 found that the overexpression of SDC3 can significantly inhibit the proliferation and metastasis of mesenchymal tumor cells. Wu et al. 64 found that miR20a-5p promotes the proliferation, migration, and invasion of head and neck squamous cell carcinoma by down-regulating TNFRSF21. Another study found that TNFRSF21 also plays an important role in regulating leukocyte infiltration 65 . obviously, the results of our analysis are consistent with the results of previous studies, which further confirms that this signature has a high value for the prognosis of OS.
To date, a lot of OS prognostic molecules have been found, including MALAT1 9 , UCA1 10 and miR191 11 . Most of these were based on single-gene prognosis studies. Existing studies have found that the occurrence and development of tumors are not caused by changes in single genes, but are the result of a series of gene changes 66 . In addition, the use of single genes cannot avoid the differences caused by individual heterogeneity. Most importantly, these studies did not use large samples to fully explore the relationship between genes and the prognosis of OS. In this study, 13 prognostic IRGs were identified by univariate cox regression and iterative LASSO cox regression analysis for the risk stratification of OS patients. Extensive analyses proved that this prognostic signature has a higher diagnostic value than pre-existing models. Recently, Shi et al. 67 also constructed a prognostic signature that consisted of three DEGs (MYC, CPE, and LY86) in OS. However, the DEGs in their study came from metastatic and non-metastatic patients and lacked a normal control sample. Therefore, the gene included in the signature did not reflect the pathological characteristics of the occurrence and development of OS. Our signature was verified by an independent verification set, which has a high diagnostic efficiency compared to that with other biomarkers. However, our research also has some unavoidable limitations and deficiencies. First, in the study, we used normal muscle tissue as a control group. Therefore, compared with normal bone tissue, there may be a certain difference in the expression of IRGs. In addition, due to the lack of protein expression profile data for OS, we used gene expression profile data, which may not fully reflect the biological characteristics of

Conclusion
In summary, we developed an IRGs signature that is a prognostic indicator in OS patients, and further verified it in an independent cohort. Hence, the signature might serve as potential prognostic indicator to identify outcome of OS and facilitate personalized management of the high-risk patients.    www.nature.com/scientificreports/ database created and maintained by NCBI, which contains high-throughput gene expression data and gene chip expression data submitted by research institutions around the world. We downloaded the gene expression profiles and the corresponding clinical data of OS from the TARGET database, including 88 OS samples, and obtained the normal muscle tissue gene expression profile data set from the GTEx database as a control group, including 396 muscle tissue samples. Then we applied the R software (Version 3.3.3, https ://www.r-proje ct.org/) sva package 69 to merge the raw data (CEL files) of the two sets. Subsequently, we used the Limma package 70 to screen DEGs between the OS tissue and normal muscle tissue. The cut-off value was | log 2 fold change (log 2 FC) |> 1 and adj. p < 0.05. We downloaded and organized the IRGs list from the ImmPort (https ://immpo rt.niaid .niaid .gov) database, selected DEIRGs from DEGs and used them for our analysis.
Functional correlation analysis of DEIRGs. GO is a tool for annotating genes and their products, which aid the integration and utilization of biological data 71 . KEGG is a database integrating genomics, chemistry, and system function information, which provides currently known biological metabolic signaling pathways 72 . The clusterProfiler package 73 was used to perform GO and KEGG enrichment analysis on DEIRGs; p < 0.05 was used as a cut-off value for significant gene enrichment. The Search Tool for the Retrieval of Interacting Genes online tool (STRING, https ://www.strin g-db.org/, Version: 11.0) 74 and Cytoscape software 75 were used to construct the PPI network for DEIRGs, and the hub IRGs were screened using the cytoHubba plug-in 76 . The hub IRGs selection criteria shortlisted the top 10 DEGs through the maximum correlation standard algorithm. Based on the semantic similarity of GO terms, GOSemSim package 77 was used to compute closeness of the relationship between the molecular function and cell localization among 10 hub IRGs, and used the average functional similarity to rank the 10 hub IRGs 78 . The results were visualized by the ggplot2 package 79 .
Identification and assessment of the prognostic signature. To develop the optimal signature for predicting OS prognosis based on IRGs, we performed univariate Cox regression analysis on the obtained DEIRGs, and selected IRGs related to prognosis with a screening criterion of p < 0.05. Next, we used the glmnet (https ://CRAN.R-proje ct.org/packa ge=glmne t) package 80 to perform a machine learning algorithm-iterative LASSO Cox regression analysis on prognostic-associated IRGs to construct the optimal prognosis signature. LASSO is highly dependent on seeds and requires cross-validation to select samples randomly. Once the seeds are replaced, the optimal lambda and resulting features change. Iterative LASSO regression was used to select high-frequency features, such as consensus genes, according to the frequency sequence of features after several runs of LASSO. Then, the consensus genes were sequentially included in the Cox model. After the AUC of ROC reached a peak, the genes were not included. At this point, the model is optimal and contains the least features 81 . We counted the consensus genes for which the frequency exceeded 50 after 500 LASSO regressions. Then we fit the expression levels of the consensus genes into a variable through the iterative LASSO cox regression to construct the optimal prognosis signature of OS. Next, we scored each sample with the optimal signature and divided the patients into a high-or low-risk group, according to the median of the score. Finally, we used R software to draw a risk factor association chart to display the survival status.
Comparison of signature with other known prognostic biomarkers and verification in an independent cohort. Many prognostic markers for patients with OS have been previously determined. SP140 has been identified as a promising prognostic marker for OS patients 8 , and the expression of MALAT1 has been shown to be associated with a worse prognosis for OS patients 9 . UCA1 expression may be an independent prognostic indicator for predicting a poor prognosis in patients with OS 10 . In addition, miR191 is highly expressed in the serum of patients with osteosarcoma and is positively correlated with clinical stage 11 . In order to determine whether our signature has a better ability to predict patient survival than known biomarkers, we conducted a ROC comparative analysis of the signature and other biomarkers. Good prognostic markers should also have a high predictive prognostic performance in other independent cohorts. To test the utility of the signature in this study, we verified it with another independent cohort (GSE39055). Details of the GSE39055 dataset are shown in Supplementary Table 1.

Subgroup survival analysis, signature clinical value evaluation. An important feature of a good
prognostic marker is that it should be independent of the currently used clinicopathological prognostic factors. To evaluate the independence and applicability of this signature, we regrouped OS patients according to different clinicopathological characteristics, and then performed Kaplan-Meier survival analysis for their subgroups. We performed univariate and multivariate Cox regressions on clinicopathological characteristics and the signature to evaluate whether the signature is an important prognostic factor.
Correlation analysis of prognostic signature and clinical characteristics. To further evaluate the correlation between the risk score based on the prognosis-associated IRGs signature and clinical characteristics, we classified patients according to age, sex, and distant metastatic status. Then we used the ggstatsplot (https :// githu b.com/Indra jeetP atil/ggsta tsplo t) package to analyze the correlation between the risk score and the aforementioned. The results are shown in the ggplot2 package.