Comprehensive analysis of KCTD family genes associated with hypoxic microenvironment and immune infiltration in lung adenocarcinoma

To obtain novel insights into the tumor biology and therapeutic targets of LUAD, we performed a comprehensive analysis of the KCTD family genes. The expression patterns and clinical significance of the KCTD family were identified through multiple bioinformatics mining. Moreover, the molecular functions and potential mechanisms of differentially expressed KCTDs were evaluated using TIMER 2.0, cBioPortal, GeneMANIA, LinkedOmics and GSEA. The results indicated that the mRNA and protein expression levels of KCTD9, KCTD10, KCTD12, KCTD15 and KCTD16 were significantly decreased in LUAD, while those of KCTD5 were significantly increased. High KCTD5 expression was significantly associated with advanced tumor stage, lymph node metastasis, TP53 mutation and poor prognosis. In addition, KCTD5 was positively correlated with CD8 + T cell, neutrophil, macrophage and dendritic cell infiltration. Additionally, KCTDs demonstrate promising prospects in the diagnosis of LUAD. Importantly, high KCTD5 expression was enriched in signaling pathways associated with the malignant progression of tumors, including the inflammatory response, the IL6/JAK/STAT3 signaling pathway, EMT and hypoxia. Further association analysis showed that KCTD5 was positively correlated with hypoxia-related genes such as HIF1. Overall, KCTDs can be used as molecular targets for the treatment of LUAD, as well as effective molecular biomarkers for diagnosis and prognosis prediction.

www.nature.com/scientificreports/ of KCTDs and immune infiltration, as well as KCTDs genetic alterations, were analyzed in LUAD. Finally, gene set enrichment analysis (GSEA) and correlation analysis were used to preliminarily explore the molecular mechanisms of KCTDs regulating LUAD.

Materials and methods
Database and public platform. The Cancer Genome Atlas (TCGA) database integrates the gene expression and corresponding clinicopathological feature data of a variety of cancer patients (http:// cance rgeno me. nih. gov) 9 . In this study, we extracted the mRNA expression profile (RNA-Seq, level 3) and clinicopathological data of lung cancer from TCGA database for ROC analysis and GSEA. UALCAN is a convenient public tumor database (http:// ualcan. path. uab. edu) 10 . In this study, the mRNA expression profile data of LUAD from the UALCAN and TCGA databases were used for differential expression analysis, including normal samples and LUAD samples with different tumor stages, lymph node metastasis statuses, and TP53 mutation statuses. We used TPM (transcripts per million) as the measure of expression. The LUAD protein expression profile provided by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) dataset were used to confirm the protein expression of KCTD family genes 11 . Total protein expression levels of KCTD family genes were compared between LUAD samples and normal samples.
Gene Expression Profiling Interactive Analysis 2 (GEPIA2) is a website for the analysis and visualization of tumor and normal expression data (http:// gepia2. cancer-pku. cn/) 12 . In this study, we used the GEPIA online tool to analyze the differential expression between LUAD and normal (match TCGA normal and GTEx data) samples and to perform correlation analysis of KCTD5 and hypoxia-related genes. The expression data were first transformed into log 2 (TPM + 1) values for differential analysis. The parameters are as follows, differential methods: ANOVA, |log 2 FC| cutoff: 1, q-value cutoff: 0.01. The Pearson correlation coefficient was used to describe the degree of correlation between two genes. We also used GSCA (Gene Set Cancer Analysis) online database (http:// bioin fo. life. hust. edu. cn/ GSCA/#/) to evaluate the expression differences between tumor and normal samples of CENP family genes in LUAD 13 .
Kaplan-Meier Plotter was used to evaluate the effect of KCTD gene expression on overall survival (OS) in patients with LUAD (http:// kmplot. com/ analy sis/) 14,15 . The patients were divided into a high expression group and a low expression group based on the median expression value. PrognoScan (http:// dna00. bio. kyute ch. ac. jp/ Progn oScan/ index. html) is a major survival analysis database with data mainly from Gene Expression Omnibus (GEO). We also used this database to assess the relationship between KCTDs expression and patient outcomes in lung cancer 16 . TIMER 2.0 is a comprehensive database that systematically analyzes and visualizes the immune infiltration of multiple tumors (http:// timer. cistr ome. org/) 17 . In this study, TIMER 2.0 was used to evaluate the association between KCTD gene expression and immune infiltrates in LUAD. The Spearman correlation coefficient was used.
LinkedOmics is an online database that includes multi-omics data from all 32 TCGA cancer types and 10 CPTAC cancer cohorts (http:// linke domics. org/ login. php) 21 . The co-expression of KCTD family genes was analyzed using RNAseq data from TCGA-LUAD database. Pearson correlation test was used to evaluate the correlation of co-expressed genes. In the "LinkFinder" module, the volcano plots and heat maps show genes that are positively/negatively associated with target genes.
GSEA. GSEA v4.1.0 (http:// www. gsea-msigdb. org/) was used to identify the possible mechanism by which KCTD5 regulates LUAD 22 . In this study, GSEA analysis was based on mRNA expression data of TCGA-LUAD. The Hallmark gene set file "h.all.v7.4.symbols.gmt" (MSigDB) was selected as the gene set database 22 . The number of permutations was set to 1000. Significance criteria were nominal P-value < 0.05 and false positive rate (FDR) < 0.05. ROC and statistical analysis. SPSS version19.0 was used for statistical analysis. Comparison of the mRNA expression and protein expression were performed using Student's t-test (UALCAN), one-way ANOVA test (GEPIA2), Wilcoxon test (TIMER2.0), or Kruskal-Wallis test (LinkedOmics). Survival analysis was conducted through Kaplan-Meier Plotter with the log-rank test. Pearson test was used to assess the correlation between the expression of any two genes. The correlation between gene expression and immune infiltration was analyzed by Spearman's method. Logistic regression analysis and ROC analysis were applied to construct a tumor diagnostic model. The area under the curve (AUC) was calculated for each ROC curve. P < 0.05 was considered statistically significant. Besides, FDR < 0.05 was an additional criterion for GSEA.

Results
Expression pattern of KCTD family genes in patients with LUAD. To identify the KCTD family genes that are differentially expressed in LUAD, differential expression analysis was performed (Student's t-test). We extracted the mRNA expression profile data of LUAD from the TCGA database to detect the mRNA expression levels of KCTD family genes. Compared with normal tissues, the expression levels of KCTD2 (P < 0.05), www.nature.com/scientificreports/ KCTD9 (P < 0.001), KCTD10 (P < 0.001), KCTD12 (P < 0.001), KCTD15 (P < 0.001) and KCTD16 (P < 0.001) were significantly decreased in LUAD. In contrast, the expression levels of KCTD5 (P < 0.001) and KCTD21 (P < 0.001) were significantly higher in LUAD than in normal tissues (Fig. 1A). Subsequently, we verified the mRNA expression of KCTD family genes in LUAD using the GEPIA2 database. The experimental results were consistent with those obtained from the TCGA database (Fig. 1B). The expression of KCTD family genes in LUAD was further confirmed in the GSCA database. KCTD5 and KCTD21 were significantly (FDR < 0.05) up-regulated in LUAD, while the expression levels of KCTD2, KCTD9, KCTD10, KCTD12, KCTD15 and KCTD16 were significantly (FDR < 0.05) lower than those in normal tissues ( Supplementary Fig. 1).
Importantly, we further explored the protein expression of KCTD family genes in LUAD using the CPTAC database. The results suggested that KCTD9 (P < 0.001), KCTD10 (P < 0.001), KCTD12 (P < 0.001), KCTD15 (P < 0.001), KCTD16 (P < 0.001), and KCTD21 (P < 0.001) had low expression in LUAD, while the expression level of KCTD2 (P < 0.05) and KCTD5 (P < 0.001) in LUAD were higher than that in normal tissues (Fig. 2). We identified the expression patterns of KCTD family genes in LUAD at the mRNA level and the protein level. www.nature.com/scientificreports/ Surprisingly, we found that the expression of KCTD2 is inconsistent at the transcription level and translation level, the same situation also occurs on KCTD21.

Correlation between the expression levels of KCTD family genes and the clinicopathological characteristics of LUAD patients. The correlation analysis between expression levels of KCTD family
genes and clinicopathological characteristics (tumor stage, lymph node metastasis and TP53 mutation status) was performed (Student's t-test). Compared with normal tissues, the expression levels of KCTD2, KCTD9, KCTD10, KCTD12, KCTD15 and KCTD16 in stage 1, stage 2, stage 3, and stage 4 were significantly reduced. The expression levels of KCTD5 and KCTD21 in stage 1, stage 2, stage 3, and stage 4 were significantly increased ( Fig. 3A). Similarly, the expression levels of KCTD2, KCTD9, KCTD10, KCTD12, KCTD15, and KCTD16 were significantly attenuated in N0, N1, N2, and N3 compared with those in normal tissues. Whereas, the expression levels of KCTD5 and KCTD21 were significantly enhanced in N0, N1, N2, and N3 (Fig. 3B). These results preliminarily suggest that KCTD family genes are related to lymph node metastasis of LUAD.
Finally, we analyzed the correlation between KCTD family genes and TP53 mutations (Fig. 3C). The experimental results demonstrated that the expression levels of KCTD2, KCTD5 and KCTD21 were significantly different between the TP53-Mutant and TP53-Nonmutant groups, suggesting that these genes were related to the mutation status of TP53.

Prognostic roles of KCTD family genes in patients with LUAD. The Kaplan-Meier Plotter database
was used to identify the prognostic value of KCTD family genes in LUAD (Fig. 4A, Table 1). The OS of the patients was used as an indicator. The results showed that there was a significant difference in OS between the high expression group and the low expression group of KCTD2 (P < 0.05), KCTD5 (P < 0.001), KCTD9 (P < 0.001), KCTD10 (P < 0.001), KCTD12 (P < 0.001), KCTD16 (P < 0.01) and KCTD21 (P < 0.001). Among them, the OS of the high expression group of KCTD2, KCTD9, KCTD10, KCTD12 and KCTD16 was longer than that of the low expression group, and the high expression of these genes suggested a better prognosis in LUAD patients. However, the OS of the group with high KCTD5 expression (63.40 months) was significantly shorter than that of the group with low KCTD5 expression (136.33 months). Furthermore, the PrognoScan survival analysis demonstrated that the high and low expression of KCTD2 (HR = 0.13, Cox P = 0.010), KCTD5 (HR = 4.71, Cox P = 0.004), KCTD9 (HR = 0.50, Cox P = 0.032), KCTD12 (HR = 0.55, Cox P = 0.013) and KCTD16 (HR = 0.56, Cox P < 0.001) were significantly correlated with the OS of lung cancer patients (Fig. 4B). Similarly, the OS of KCTD5 low expression group was significantly longer than that of KCTD5 high expression group (Fig. 4B). These results suggest that KCTD5 may be associated with the poor prognosis of LUAD and can be used as a prognostic marker.

Association of KCTD family genes with immune infiltration level in LUAD.
We further investigated the correlation of KCTD expression with the immune infiltration levels in LUAD. As shown in Fig. 5, KCTD10 was related to all six immune infiltrates. KCTD2 and KCTD12 were positively correlated with CD4 + T cells, CD8 + T cells, neutrophils, macrophages and dendritic cells but had no correlation with B cells. All the differentially expressed KCTDs were positively correlated with macrophages. Except for KCTD15, all the other differentially expressed KCTDs were positively correlated with CD8 + T cells.
Genetic alteration and interaction analyses of KCTDs in LUAD. Subsequently, the genetic alterations of KCTD family members in LUAD were analyzed using the cBioPortal database (Fig. 6A). Among a series    www.nature.com/scientificreports/ www.nature.com/scientificreports/ KCTDs co-expression networks in LUAD. We explored the KCTDs co-expression networks in LUAD using LinkedOmics. As shown in the volcano plot (Fig. 7A), 5223 genes (red dots) were positively correlated with KCTD2, and 4768 genes (green dots) were negatively associated with KCTD2 (P < 0.05); 4846 genes were positively related with KCTD5, and 5622 genes were negatively correlated with KCTD5 (P < 0.05); 4077 genes were positively correlated with KCTD9, and 4741 genes were negatively associated with KCTD9 (P < 0.05); 6480 genes were positively correlated with KCTD10, and 4545 genes were negatively associated with KCTD10 (P < 0.05); 7509 genes were positively related with KCTD12, and 5458 genes were negatively correlated with KCTD12 (P < 0.05); 4987 genes were positively correlated with KCTD15, and 2512 genes were negatively associated with KCTD15 (P < 0.05); 7047 genes were positively related with KCTD16, and 3479 genes were negatively correlated with KCTD16 (P < 0.05); 2749 genes were positively correlated with KCTD21, and 5126 genes were negatively

Diagnostic features of KCTD family genes in patients with LUAD.
To identify the diagnostic value of KCTD family genes in LUAD, we performed logistic regression analysis and ROC analysis. As shown in Table 2 (Table 3).
To exclude the influence of clinicopathological parameters on the performance of KCTD family genes, we further built three prediction models, namely Model 1 (including only target genes), Model 2 (including target genes  www.nature.com/scientificreports/ and clinicopathological parameters) and Model 3 (excluding target genes) ( Table 4, Fig. 8B-E). In this part, we focused on genes with AUCs greater than 0.8 (KCTD12, KCTD10, KCTD16, and KCTD5). The AUCs of Model 1 and Model 2 of the KCTD12 gene were very close (AUC Model 1 = 0.924, P < 0.001; AUC Model 2 = 0.926, P < 0.001), both greater than 0.9, while the AUC of Model 3 was less than 0.6 (AUC Model 3 = 0.585, P < 0.05) (Fig. 8B). Similar results were also found for KCTD10 (Fig. 8C), KCTD16 (Fig. 8D), and KCTD5 (Fig. 8E), and the AUCs of Model 1 and Model 2 were close to and significantly greater than that of Model 3. The results suggested that these four target genes play a decisive role in the prediction model and can be used as independent diagnostic factors.
Potential mechanism of the oncogenic effect of KCTD5 in LUAD. To further explore the potential mechanism of the regulation of KCTD family genes in LUAD, we performed GSEA and correlation analysis. Based on previous studies, we focused on KCTD5, which is highly expressed in LUAD and has bright prospects in prognosis prediction and diagnosis and may be used as a therapeutic target for LUAD. High KCTD5 expression was enriched in many signaling pathways associated with the malignant progression of tumors, including the inflammatory response (P < 0.001, FDR < 0.001, NES = 2.29), IL6/JAK/STAT3 signaling pathway (P < 0.001, Table 3. Logistic regression analysis: KCTD family genes and clinicopathological characteristics as covariates.  (Fig. 9A). It is well known that the hypoxic microenvironment is a crucial factor leading to tumor metastasis. We further tested the association between KCTD5 and hypoxiarelated genes, and the results showed that KCTD5 was positively correlated with HIF1 (P < 0.001, R = 0.34), VEGF (P < 0.001, R = 0.33), ALDOA (P < 0.001, R = 0.42), ENO1 (P < 0.001, R = 0.37), LDHA (P < 0.001, R = 0.22), P4HA1 (P < 0.001, R = 0.24), PGAM1 (P < 0.001, R = 0.38), and TUBB6 (P < 0.001, R = 0.30) (Fig. 9B). These results reminder that KCTD5 may promote the metastasis of LUAD by regulating the hypoxic microenvironment, and the specific molecular mechanism needs to be further studied. KCTD5 may be a new therapeutic target for LUAD.

Discussion
The study focused on LUAD because it is the predominant subtype of non-small cell lung cancer (NSCLC) 23 . LUAD is a heterogeneous disease with a variety of somatic mutations, including some driver mutations, such as KRAS, EGFR, and ALK 24 . Molecular-targeted therapies against driver gene mutations have dramatically changed the treatment strategies for LUAD 25 . However, the treatment of LUAD also faces great challenges, such as drug resistance and tumor metastasis. Thus, it is necessary to identify novel, stable and reliable molecular markers for early screening and targeted therapy. Previous studies have shown that KCTD family genes are involved in the regulation of tumorigenesis. KCTD15 inhibits the Hedgehog signaling pathway by increasing the stability of the tumor suppressor KCASH2, thereby inhibiting the proliferation of medulloblastoma cells 26 28 . Based on early high-throughput microarray screening, we comprehensively explored the expression patterns, diagnostic value, prognostic value, immune infiltration levels, genetic mutation status, and enrichment of downstream signaling pathways of KCTDs in LUAD.
In the present study, the expression of KCTD family genes in LUAD was detected from the transcriptional level and protein expression level, and a series of differentially expressed KCTDs were identified. Surprisingly, we found that the expression of KCTD2 is inconsistent at the transcription level and translation level, the same situation also occurs on KCTD21. As we all know, gene expression is generally regulated by both transcription and translation, and the correlation between mRNA and corresponding protein expression depends on many regulatory factors and metabolic processes. From RNA to protein and then to phenotype is a complex and delicate process. After the formation of mRNA, regulatory activities such as post-transcriptional regulation, epigenetic modification and post-translational regulation are likely to occur, which are important factors leading to the disconformity of mRNA expression trend with its protein level. In addition, background noise from sequencing was also one of the reasons for the low mRNA-protein correlation. Subsequently, we analyzed the correlation between KCTDs gene expression level and clinicopathologic features, and we found that the expression of KCTD5 and KCTD21 was significantly higher in the lymph node metastasis group than in the normal group. This finding suggests that KCTD5 and KCTD21 may be involved in the regulation of tumor metastasis. Importantly, GSEA showed that high KCTD5 expression was positively correlated with EMT. It is well known that EMT plays an important role in tumor metastasis, malignant progression, tumor stemness and anti-apoptosis 29,30 . www.nature.com/scientificreports/ Interestingly, we also found that KCTD5 positively regulates tumor hypoxia signaling pathways. Further association analysis showed that KCTD5 was positively correlated with hypoxia-related genes such as hypoxiainducible factor 1 (HIF1) and vascular endothelial growth factor (VEGF). Hypoxia is a widespread phenomenon in most solid tumors and is closely related to tumor proliferation, differentiation, angiogenesis, energy metabolism and drug resistance 31,32 . At present, hypoxia is becoming recognized as an important driver of EMT in cancer 33,34 . KCTD5 is a member of the KCTD protein family, and its role in lung cancer has not yet been reported. Thus, we propose a new question: does KCTD5 mediate tumor metastasis by regulating hypoxia-induced EMT? In the future, we will further explore the molecular function and regulatory mechanism of KCTD5 in LUAD to provide a new molecular target for the diagnosis and treatment of LUAD, as well as a new mechanism explanation for the metastasis of LUAD.
In conclusion, our study identified the expression pattern, clinical application value, immune infiltration and molecular function of KCTD family genes in LUAD, and KCTD5 as an important prognostic biomarker may be involved in the regulation of tumor progression mediated by hypoxic microenvironment (Fig. 10). KCTD family genes, especially KCTD5, may serve as new therapeutic targets for LUAD. www.nature.com/scientificreports/