Identification and validation of a cigarette smoke-related five-gene signature as a prognostic biomarker in kidney renal clear cell carcinoma

Cigarette smoking greatly promotes the progression of kidney renal clear cell carcinoma (KIRC); however, the underlying molecular events have not been fully established. In this study, RCC cells were exposed to the tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK, a nicotine-derived nitrosamine) for 120 days (40 passages), and soft agar colony formation, wound healing and transwell assays were then used to characterize the cells. RNA-seq was used to identify differentially expressed genes. We found that NNK promoted RCC cell growth and migration in a dose-dependent manner, and RNA-seq identified 14 differentially expressed genes. In the TCGA-KIRC cohort, Lasso regression and multivariate COX regression models were used to screen for and construct a five-gene signature containing ANKRD1, CYB5A, ECHDC3, MT1E, and AKT1S1. This novel gene signature was significantly associated with TNM stage, invasion depth, metastasis, and tumor grade. Moreover, compared with individual genes, the gene signature had a higher hazard ratio and therefore greater prognostic value for KIRC. A nomogram based on clinical features and the gene signature was also developed and showed good applicability. Finally, AKT1S1, the most crucial component of the gene signature, was significantly induced after NNK exposure, and its related AKT/mTOR signaling pathway was dramatically activated. Our findings support that NNK exposure promotes KIRC progression and that the novel cigarette smoke-related five-gene signature might serve as a highly efficient biomarker to identify progression in KIRC patients. AKT1S1 might play an important role in cigarette smoke exposure-induced KIRC progression.


Results
NNK exposure increased growth and migration abilities of RCC cells. The human RCC cell lines 786-O and KETR-3 were continuously exposed to 0.1% DMSO or NNK (0.01 and 0.1 μM) for 120 days (40 passages). The soft agar colony formation assay revealed a significant dose-dependent increase in the number of cell colonies after NNK exposure compared with the 0.1% DMSO control group (Fig. 1A and B). The wound healing assay showed that long-term NNK exposure significantly promoted the migration ability of KETR-3 and 786-O cells (Fig. 1C-F). The transwell assay likewise showed that the number of migrated cells increased significantly in a dose-dependent manner after long-term NNK exposure in both KETR-3 and 786-O cells compared with the respective 0.1% DMSO controls (Fig. 1G and H).
Expression of fourteen cigarette smoke exposure-related genes in KIRC and normal kidney tissues in TCGA-KIRC dataset. To explore the roles of these fourteen genes in KIRC, we evaluated their expression patterns in the TCGA-KIRC cohort. As shown in Fig. 3, three cigarette smoke exposure-reduced genes (DCN, ECHDC3, MT1E) were significantly down-regulated and two cigarette smoke exposure-induced genes (AKT1S1, TEN1) were significantly up-regulated in KIRC compared with normal kidney tissues. Studies have reported that active smoking is associated with histological KIRC subtype 3,4. Here we found that ECHDC3 expression in KIRC decreased continuously, while AKT1S1 expression increased gradually, with increasing malignancy of pathological grade (Supplementary Fig. S1). However, four cigarette smoke exposure-reduced genes (ANGPTL4, TGM2, TICAM2, ZNF579) were significantly up-regulated and one cigarette smoke exposure-promoted gene (MAPK14) was significantly down-regulated in KIRC compared with normal kidney tissues (Fig. 3). Furthermore, TGM2 and TICAM2 expression in KIRC gradually increased, while MAPK14 expression gradually decreased, with increasing malignancy of pathological grade (Supplementary Fig. S1). In addition, MAGEB2 had very low expression levels in both KIRC and normal kidney tissues.
Figure 1. (A and B) Soft agar colony formation of KETR-3 and 786-O cells exposed to 0 (0.1% DMSO), 0.01, 0.1 μM NNK at passage 40 for 120 days; the number of cell colonies was counted (n = 3/group). (C-F) Cell wound healing assays in KETR-3 and 786-O cells exposed to 0 (0.1% DMSO), 0.01, 0.1 μM NNK at passage 40. The width of the cell wound was measured (n = 3/group). (G and H) The migration of KETR-3 and 786-O cells exposed to 0 (0.1% DMSO), 0.01, 0.1 μM NNK at passage 40 (magnification × 100); the relative number of migrated cells per field is shown (n = 3/group). Data are presented as means ± standard deviations. * P < 0.05, ** P < 0.001.
www.nature.com/scientificreports/
Construction and prognostic value of cigarette smoke exposure-related gene signature in TCGA-KIRC cohort. Individual KIRC tissues have recently been shown to have substantial intratumour heterogeneity, indicating that single genes are unlikely to reveal the complete status of KIRC progression 15. In addition, studies have found that a gene signature is better than a single gene for judging the prognosis of a variety of tumors [16][17][18]. A single gene was therefore not sufficiently comprehensive or efficient to evaluate the contribution of cigarette smoking to KIRC progression, so in this study the cigarette smoke exposure-related gene signature was produced by integrating multiple candidate genes. Lasso regression analysis was first used to avoid over-fitting in the gene signature, and five cigarette smoke exposure-related candidate genes, ANKRD12, CYB5A, ECHDC3, MT1E, and AKT1S1, were retained when the optimal λ value was achieved (Supplementary Fig. S2A and B). Finally, a cigarette smoke exposure-related five-gene signature was established using the multivariate COX regression model and was expressed as a risk score, calculated as the sum over the five genes of each gene's risk coefficient multiplied by its mRNA expression level (Table 1).
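The risk score described above is a linear combination of expression values. The following is a minimal sketch of that calculation, not the authors' code. The coefficient values are hypothetical placeholders chosen only for illustration (the fitted values are those of Table 1, which is not reproduced here), and the function and variable names are assumptions.

```python
from typing import Dict

# HYPOTHETICAL coefficients for illustration only. The fitted Cox
# coefficients are reported in Table 1 and are not reproduced here.
HYPOTHETICAL_COEFFICIENTS: Dict[str, float] = {
    "ANKRD12": -0.20,
    "CYB5A": -0.10,
    "ECHDC3": -0.30,
    "MT1E": -0.10,
    "AKT1S1": 0.50,
}


def risk_score(
    expression: Dict[str, float],
    coefficients: Dict[str, float] = HYPOTHETICAL_COEFFICIENTS,
) -> float:
    """Sum of (risk coefficient x mRNA expression level) over the signature genes."""
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)
```

With real data, `expression` would hold one patient's mRNA levels for the five genes, on whatever scale was used when the coefficients were fitted.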
We examined the correlation of the risk score with patients' clinicopathological characteristics and found that the risk score was significantly higher in the advanced TNM stage (III/IV), invasion depth T3/4, lymph node metastasis N1, distant metastasis M1 and high pathological grade (G3/G4) groups than in the early TNM stage (I/II), T1/2, N0, M0 and low pathological grade (G1/G2) groups (Fig. 5A). The time-dependent ROC curve was used to assess predictive value for KIRC patients' survival and revealed that the risk score had a larger area under the curve (AUC) than individual genes (Fig. 5B). Next, patients were classified into a low-risk or high-risk group based on the median risk score to further explore the prognostic value of the risk score in KIRC (Supplementary Fig. S2C). We found that the number of deaths was significantly greater in the high-risk group than in the low-risk group, and that the survival time of deceased patients decreased significantly with increasing risk score (Supplementary Fig. S2D). The Kaplan-Meier curve showed that the survival time of patients with a high risk score was significantly shorter than that of patients with a low risk score (HR = 2.12, 95% CI: 1.54-2.90) (Fig. 5C). In addition, a high risk score was a significantly adverse prognostic indicator in both early and advanced TNM stages (Supplementary Fig. S3). Multivariate analysis showed that the risk score was an independent prognostic indicator (HR = 1.7, 95% CI: 1.25-2.30) after adjusting for age, sex, tumor TNM stage and grade (Fig. 5D).
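The median split and survival comparison can be illustrated with a small, dependency-free sketch. It is an illustration under stated assumptions rather than the analysis pipeline used in the study: the function names are hypothetical, scores equal to the median are assigned to the low-risk group (the text does not say how ties were handled), and the Kaplan-Meier estimator is the standard product-limit form.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(scores: Sequence[float]) -> List[str]:
    """Label each patient 'high' or 'low' risk relative to the median score.

    Assumption: scores equal to the median are labelled 'low'.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns (time, survival probability) at each distinct death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Running `kaplan_meier` separately on the patients labelled "high" and "low" gives the two curves that a plot such as Fig. 5C compares; the hazard ratio itself would come from a Cox model, which this sketch does not fit.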
Nomogram construction and validation. Based on the multivariate COX proportional regression model (Fig. 5D), a prognostic nomogram was constructed to quantitatively predict individualized prognostic risk for 1-, 3-, and 5-year overall survival by integrating the cigarette smoke exposure-related gene signature risk score with baseline variables (age and gender) and other independent clinical variables (grade and TNM stage). Each variable was assigned a point value based on its risk contribution to the model (Fig. 6A). Finally, the calibration curves showed that the 1-, 3-, and 5-year overall survival predicted by the nomogram was consistent with actual observations (Fig. 6B-D), indicating that the nomogram performed well.
NNK exposure promoted AKT1S1 expression and activated AKT/mTOR signaling pathways. In this study, we found that AKT1S1 was the most important component of the cigarette smoke exposure-related gene signature, and studies have shown that AKT1S1 plays a critical role at the intersection of the AKT/mTOR signaling pathways 19. We therefore examined AKT1S1 expression levels in Ketr-3 and 786-O cells exposed to 0.1% DMSO or 0.01, 0.1 μM NNK. We found that NNK exposure dramatically up-regulated AKT1S1 mRNA levels (Fig. 7A), and that AKT1S1, p-AKT and p-mTOR protein levels increased significantly after NNK exposure compared with the 0.1% DMSO control group (Fig. 7B-C). In addition, to investigate the function and pathways of AKT1S1 in NNK-promoted KIRC progression, we constructed a protein-protein interaction (PPI) network with the String database (https://string-db.org/) (Supplementary Fig. S4). The results showed that AKT1S1 was enriched in the mTOR signaling pathway, which had the lowest false discovery rate among the biological processes in the Gene Ontology (GO) analysis, and in the autophagy pathway, which had the lowest false discovery rate in the KEGG analysis.

Discussion
It is well known that unlimited growth and metastasis of tumor cells are crucial characteristics of malignant tumor progression 20. Epidemiological data have indicated that cigarette smoking is associated with malignant tumor progression and poor prognosis in KIRC 21,22. In this study, we found that long-term exposure to a major component of cigarette smoke, nicotine-derived NNK, increased the colony formation and migration abilities of RCC cells, which are key events in tumor growth and metastasis. Based on these findings, it is not surprising that cigarette smoke enhances the malignant phenotypes of tumor cells and eventually promotes KIRC progression.
Cigarette smoke initiates and promotes the malignancy of RCC through multiple molecular events. Recently, the link between cigarette smoke carcinogens and the inactivation of tumor suppressor genes or the activation of oncogenes has been validated in the development and progression of cancer 23,24. Thus, further exploration of the molecular events involved in NNK-induced malignancy of RCC cells might provide new biomarkers for KIRC progression. Here, we performed genome-wide sequencing to seek potential biomarkers and found fourteen cigarette smoke exposure-related genes showing NNK dose-dependent down-regulation or up-regulation. Moreover, real-time PCR validated the reliability of the RNA-seq results.
Therefore, we reasonably speculated that cigarette smoke exposure-related genes have broad prospects for evaluating KIRC progression. By comparing gene expression patterns in KIRC and normal kidney tissues, we found that nine genes did not contradict the findings in RCC cells with NNK exposure. Next, the survival analysis indicated that the cigarette smoke exposure-reduced genes ANKRD12, CYB5A, ECHDC3, DCN and HOXC10 and the cigarette smoke exposure-promoted gene AKT1S1 showed positive and negative relationships with overall survival in KIRC, respectively. This is in accord with studies reporting that decreased ANKRD12 and CYB5A expression and increased AKT1S1 expression are associated with a higher frequency of tumor metastasis and indicate an increased risk of tumor progression [25][26][27]. Although cigarette smoke exposure-reduced DCN was found to be a negative prognostic factor in KIRC, increasing evidence indicates that lack of DCN expression is regarded as an indicator of tumor metastasis 28. However, cancer heterogeneity means that individual genes perform unsatisfactorily in judging progression in KIRC patients. Therefore, new efforts are urgently required to develop a comprehensive estimate for KIRC. Studies have found that a gene signature is better than a single gene for judging the prognosis of a variety of tumors [16][17][18]. In this study, Lasso regression was used to screen variables for the prognostic model in order to avoid extreme predictions. The new cigarette smoke-related five-gene signature was then established using the multivariate COX regression model. To provide a clinically quantitative method for the gene signature, we produced a risk score based on the risk coefficient of each gene and its mRNA expression level. This scoring approach and its cut-off value have been confirmed to be robust in some cancer-related studies and may be readily translated to clinical practice [29][30][31].
Our data showed that a high risk score for the gene signature was significantly associated with increased tumor invasion depth, lymph node metastasis, distant metastasis and advanced TNM stage. Using the time-dependent ROC curve, we found that the risk score had better predictive value than individual genes for KIRC prognosis. More importantly, the risk score significantly stratified patient outcomes, and a high risk score was a significantly more unfavorable factor for KIRC prognosis than any single gene, indicating that the risk score had stronger prognostic power than single genes.
Given that AKT1S1 was the most important component of the cigarette smoke exposure-related gene signature, we examined its expression and observed significant up-regulation after NNK exposure. This is in accord with reports that AKT1S1 expression is elevated in cancer cells and could contribute to tumor metastasis 27,32. Studies have shown that AKT1S1 is involved in regulating cell growth, apoptosis, oxidative stress, autophagy and angiogenesis through various signaling pathways such as AKT, mTOR and NF-κB, and that the AKT1S1 phosphorylation state could predict hyperactivation of the AKT/mTOR pathway in multiple cancer cell types 19,33. In this study, we found that NNK exposure activated the AKT/mTOR signaling pathway, and the protein-protein interaction (PPI) network showed that the function and pathways of AKT1S1 were mainly enriched in the mTOR and autophagy pathways. In addition, studies have identified AKT/mTOR signaling as an important dysregulated pathway in KIRC, and some mTOR-targeted inhibitors, such as everolimus and temsirolimus, have been validated to contribute to better clinical outcomes in metastatic renal cell carcinoma [34][35][36]. Therefore, our data suggest that up-regulation of AKT1S1, the AKT/mTOR signaling pathway and autophagy might play an important role in cigarette smoke-induced KIRC metastasis and progression (Supplementary Fig. S5).
In summary, NNK exposure promoted the growth and migration abilities of RCC cells. Using RNA-seq, fourteen cigarette smoke exposure-related genes were obtained. For nine of these genes, the expression patterns in KIRC compared with normal kidney tissues did not contradict the findings in RCC cells with NNK exposure, and their prognostic value was further analyzed. A cigarette smoke-related five-gene signature was screened and integrated by Lasso regression analysis and a multivariate COX regression model. The gene signature was more powerful than any single gene for predicting the prognosis of KIRC patients. Moreover, NNK exposure-induced AKT1S1 and its related AKT/mTOR signaling pathway might play an important role in cigarette smoking-induced KIRC progression. Therefore, our findings provide mechanistic insight into cigarette smoke-induced KIRC progression and support that the cigarette smoke-related gene signature might serve as a highly efficient biomarker to identify metastasis and prognosis in KIRC patients.
However, one limitation is the lack of animal models of NNK-promoted KIRC progression and of validation of gene signature expression in NNK-promoted tumor tissues from such models. In addition, the role of AKT1S1 and its related AKT/mTOR signaling pathway in NNK exposure-induced KIRC progression cannot be fully assessed without using AKT1S1 inhibitors in NNK-stimulated cell and animal models.

Methods
Soft agar colony formation assay. The 6-well plates were first coated with 0.60% agarose. Then 500 cells per well were plated in triplicate in 1 ml of 0.35% agarose over the 0.60% agarose layer. Cultures were fed every 3 days. At 14 days, 0.5% NBT was used to stain the colonies. Colonies that stained strongly brown were scored as "positive", and the number of colonies in each well was counted with Image J software.
Wound healing assay. Cells were grown to 80% confluence in a 6-well plate in complete medium overnight and switched to serum-free medium for another 12 h at 37 °C and 5% CO2. An injury line was made using a 2-mm-wide plastic pipette tip. The wells were then rinsed with phosphate-buffered saline and covered with serum-free medium, and photographs were acquired at 0 h and 24 h. The scratch width of every group at 0 h and 24 h was measured, and the migration distance was calculated by subtracting the scratch width at 24 h from the scratch width at 0 h. Finally, the ratio of the migration distance in the NNK groups relative to the DMSO group was calculated.
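The quantification described above reduces to simple arithmetic, sketched below. This is an illustrative example only: the function names are hypothetical, and normalizing each NNK replicate to the mean DMSO migration distance is an assumption, since the text does not state how replicates were combined.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Widths = Tuple[float, float]  # (scratch width at 0 h, scratch width at 24 h)


def migration_distance(width_0h: float, width_24h: float) -> float:
    """Migration distance = scratch width at 0 h minus scratch width at 24 h."""
    return width_0h - width_24h


def relative_migration(
    treated: Sequence[Widths], control: Sequence[Widths]
) -> List[float]:
    """Migration distance of each treated replicate relative to the control mean.

    Assumption: each NNK replicate is divided by the mean DMSO distance.
    """
    control_mean = mean(migration_distance(w0, w24) for w0, w24 in control)
    if control_mean == 0:
        raise ValueError("control migration distance is zero; ratio undefined")
    return [migration_distance(w0, w24) / control_mean for w0, w24 in treated]
```

For example, a DMSO scratch that narrows from 2.0 to 1.0 (distance 1.0) and an NNK scratch that narrows from 2.0 to 0.5 (distance 1.5) give a relative migration of 1.5, in whatever length unit the widths were measured.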
Transwell assay. The transwell filter inserts with a pore size of 8 μm were used for the cell migration assay [...] and placed in a 24-well plate containing 500 μl complete medium. After 12 h incubation at 37 °C, cells in the upper chamber were carefully removed with a cotton swab, and the cells that had traversed the membrane were fixed in methanol and stained with crystal violet (0.04% in water; 100 μl). The inserts were then placed under an inverted microscope (100 ×), and five fields of each insert were photographed. The crystal violet-positive cells that had passed through the membrane were counted in each field with Image J software, and the ratio of the number of migrated cells in the NNK groups relative to the DMSO group was calculated.
Transcriptome resequencing and quantitative analysis. Human transcriptome resequencing (Vazyme, China) was used to analyze gene expression in 786-O cells that had been exposed to 0.1% DMSO, 0.01 μM or 0.1 μM NNK for 120 days (40 passages). Cufflinks (cufflinks-2.2.1) was used to perform the quantitative analysis of gene expression.
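The Discussion describes the fourteen candidate genes as showing NNK dose-dependent down-regulation or up-regulation. One simple way to screen quantified expression values for such a pattern is sketched below. This is an assumption made for illustration, not the selection procedure reported by the authors: the function name is hypothetical, and a strictly monotonic change across the three doses is only one possible definition of dose dependence.

```python
from typing import Sequence


def dose_trend(expression_by_dose: Sequence[float]) -> str:
    """Classify expression ordered by increasing dose (e.g. DMSO, 0.01, 0.1 uM NNK).

    Returns 'up' if expression rises at every step, 'down' if it falls at
    every step, and 'none' otherwise.
    """
    if len(expression_by_dose) < 2:
        raise ValueError("need expression values for at least two doses")
    steps = list(zip(expression_by_dose, expression_by_dose[1:]))
    if all(later > earlier for earlier, later in steps):
        return "up"
    if all(later < earlier for earlier, later in steps):
        return "down"
    return "none"
```

In practice such a filter would be combined with a statistical test or fold-change threshold on the quantified expression values; none is applied here.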

Western blot analysis. Western blot was carried out as previously reported 37. Anti-GAPDH was used as the protein loading control. The antigen-antibody complex was detected with an enhanced chemiluminescence system. All blots were cut prior to hybridization with antibodies. Moreover, we checked the same molecules in 786-O, Ketr-3 and ACHN cells, and the results showed that each molecule appeared at the same location on the membrane and that all molecular weights were consistent with the antibody specifications. In addition, there were few or almost no non-specific bands in any of the blots. In Supplementary Fig. S6, we provide unprocessed images of all blots with membrane edges visible. All experiments were repeated three times.