Prognostic significance of TCF21 mRNA expression in patients with lung adenocarcinoma

Several prognostic indicators behave inconsistently between male and female patients with lung adenocarcinoma, suggesting that the genetic background of the disease may differ by gender. In this study, we first used the Gene-Cloud of Biotechnology Information (GCBI) bioinformatics platform to identify differentially expressed genes (DEGs) between lung adenocarcinoma and normal lung tissues that were shared by both genders, thereby eliminating gender differences. We then identified transcription factor 21 (TCF21) as a hub gene among these DEGs by constructing a gene co-expression network on the GCBI platform. Furthermore, we used the survival analysis platforms Kaplan-Meier plotter and PrognoScan to assess the prognostic value of TCF21 expression in patients with lung adenocarcinoma. We conclude that decreased TCF21 mRNA expression predicts poor prognosis in patients with lung adenocarcinoma.

TCF21 is also the core gene in the DEGs obtained from GSE10072. To verify the results obtained from GSE40791, we applied the same GCBI workflow to the DEGs between lung adenocarcinoma and normal lung tissues in GSE10072. We identified 76 potential DEGs in males (Fig. 4a and Supplementary Table S7) and 87 potential DEGs in females (Fig. 4b and Supplementary Table S8). After removing duplicate entries with the same gene symbol and entries without a gene annotation, 52 DEGs (Supplementary Table S9) and 65 DEGs (Supplementary Table S10) remained in the male and female groups, respectively. The intersection of these two DEG sets contained 46 DEGs shared by both genders, which eliminated gender differences (Fig. 4c). Among them, 16 DEGs were incorporated into the gene co-expression network (Fig. 5a). As shown in Table 2 and Fig. 5b, TCF21 again had the most gene connections in the gene co-expression network. Consequently, we confirmed that TCF21 is the core gene in the DEGs between lung adenocarcinoma and normal lung tissues.
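The gender-stratified intersection and hub-gene selection described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the GCBI implementation: every gene symbol other than TCF21, the edge list, and the function names `shared_degs` and `hub_gene` are hypothetical, and the hub is taken to be simply the gene with the most connections.

```python
from collections import Counter

def shared_degs(male_degs, female_degs):
    """Return the DEGs present in both gender groups (set intersection)."""
    return set(male_degs) & set(female_degs)

def hub_gene(edges):
    """Return (gene, degree) for the most connected gene in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(1)[0]

# Hypothetical toy inputs, for illustration only.
male = ["TCF21", "GENE_A", "GENE_B", "GENE_C"]
female = ["TCF21", "GENE_A", "GENE_B", "GENE_D"]
common = shared_degs(male, female)

# Hypothetical co-expression edges; keep only edges between shared DEGs.
edges = [("TCF21", "GENE_A"), ("TCF21", "GENE_B"), ("GENE_C", "GENE_D")]
kept = [(a, b) for a, b in edges if a in common and b in common]
hub = hub_gene(kept)
```

In this toy network the gene with the highest degree is TCF21, mirroring how the hub was read off the co-expression network in the text. A real analysis would also need a criterion for deciding which gene pairs count as co-expressed, which is not specified here.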

Discussion
To assess the prognostic value of gene expression for survival in lung cancer, expression has usually been measured at the protein level by immunohistochemistry [20][21][22]. However, mRNA expression measured by polymerase chain reaction has also shown reliable prognostic value in patients with lung cancer 23,24. In this study, we extracted gene expression array data from GEO datasets using the GCBI bioinformatics analysis platform and further assessed these data using the survival analysis platforms Kaplan-Meier plotter and PrognoScan. We found that decreased TCF21 mRNA expression is an unfavorable prognostic factor for patients with lung adenocarcinoma. Our study indicates that combining bioinformatics analyses is a good strategy for studying the prognosis of lung adenocarcinoma and may be useful for assessing prognosis in other cancer types.
Previous studies did not consider the effect of gender differences on DEGs between lung cancer and normal tissues at the design stage. Because males and females differ in genetic background, we reasoned that lung adenocarcinomas arising in males and in females may also differ significantly at the genetic level. In our study, to eliminate the impact of gender differences, we divided both lung adenocarcinoma and non-neoplastic lung samples into two groups according to gender. Using comprehensive bioinformatics analyses, we found that decreased mRNA expression of TCF21 is an unfavorable prognostic factor for patients with lung adenocarcinoma. To our knowledge, this is the first study to investigate the prognostic value of TCF21 in patients with this specific lung cancer type.
Figure 2. DEGs between lung adenocarcinoma and normal lung tissues for GSE40791. (a) Heat map for potential DEGs (n = 321) from males (lung adenocarcinoma n = 53, in blue; normal lung tissues n = 58, in yellow). (b) Heat map for potential DEGs (n = 463) from females (lung adenocarcinoma n = 41, in blue; normal lung tissues n = 42, in yellow). (c) Duplicate expressions with the same gene symbol and those without a specific gene symbol were removed; the Venn diagram shows DEGs (n = 223) from males, DEGs (n = 330) from females and the co-expressed genes (n = 211) that eliminated gender differences.
TCF21 (also known as Capsulin, Pod-1 or Epicardin) is a member of the basic-helix-loop-helix transcription factor family [25][26][27]. During embryogenesis, TCF21 is crucial for the development of a number of cell types in the spleen 28, heart 29, kidney 30 and lung 30. TCF21 is also important in cancer development. In vitro, it can affect tumor cell cycle balance 31 and suppress cancer cell proliferation, migration 32 and invasion 32,33. Additionally, mouse model experiments showed that TCF21 can significantly reduce tumor growth in vivo 34,35. Consequently, TCF21 is considered to be a potential tumor suppressor 34,36.
Unfortunately, TCF21 expression is commonly deregulated by promoter hypermethylation in different types of cancer 32,33,[37][38][39][40]. This phenomenon is detrimental for cancer patients. One study reported that TCF21 promoter methylation is correlated with decreased survival in patients with metastatic skin melanoma 33. Low expression of TCF21 is also an independent prognostic factor for poor survival in patients with clear cell renal cell carcinoma 41 or gastric cancer 42.
For lung cancer, previous studies demonstrated that aberrant TCF21 promoter methylation existed in 9 of 10 lung cancer cell lines 39 and in the majority of lung cancer tissues (70-81%) 34,38,39. Richards KL and colleagues 38 reported that 84% of non-small cell lung cancer samples showed decreased TCF21 protein expression. TCF21 expression was not correlated with gender and was lower in adenocarcinoma than in squamous cell carcinoma 38, indicating that TCF21 is a crucial tumor suppressor for lung adenocarcinoma in both males and females. We corroborated these findings in the present study. Recently, Wu H and colleagues 35 used the survival analysis platform Kaplan-Meier plotter and found that low TCF21 expression is associated with poor survival in lung cancer. In our study, we used both Kaplan-Meier plotter and PrognoScan and are the first to conclude that decreased mRNA expression of TCF21 is an unfavorable prognostic factor in lung adenocarcinoma patients regardless of gender, and that TCF21 expression showed no prognostic value in patients with lung squamous cell carcinoma. We therefore suggest that TCF21 may be a prognostic factor specific to lung adenocarcinoma rather than to lung squamous cell carcinoma, a distinction that previous studies have not taken into account.
TCF21 is a crucial tumor suppressor, and its expression is commonly deregulated by aberrant promoter methylation in different types of cancer. Additionally, decreased expression of TCF21 is a poor prognostic factor for cancer patients. However, little is known about the mechanism by which TCF21 suppresses tumors. Future studies should therefore elucidate the mechanisms of TCF21 tumor suppressor function.
In addition, the bioinformatics analysis platforms GCBI, Kaplan-Meier plotter and PrognoScan used in this study are based on unsupervised analysis of gene expression profiles (named the "one-step-clustering" approach by Li J et al. 43). Because of the extremely high variability in gene expression profiles between individual tumors and because "passenger signals" may mask the "real" cancer gene signals, our results derived from this "one-step-clustering" approach may be less robust and accurate 43,44.
In summary, we first used the GCBI bioinformatics analysis platform to identify DEGs between lung adenocarcinoma and normal lung tissues that eliminated gender differences, and found that TCF21 is the hub gene among them. Then, using the survival analysis platforms Kaplan-Meier plotter and PrognoScan, we concluded that decreased mRNA expression of TCF21 is a poor prognostic factor in patients with lung adenocarcinoma.
cn) is a platform that combines a variety of research findings, genetic information, sample information, data algorithms and bioinformatics to create a "gene knowledge base," which encompasses biology, medicine, informatics, computer science, mathematics, graphics and other disciplines. GCBI includes more than 120 million copies of genomic samples, approximately 90,000 copies of tumor samples and more than 17 million copies of genetic information. GCBI is therefore a well-suited bioinformatics analysis platform and has provided data analysis support for many cancer studies [45][46][47][48][49]. In this study, we used GCBI to identify DEGs between lung adenocarcinoma and normal lung tissues and identified TCF21 as the hub gene in the DEG co-expression network. In the Differential Gene Expression Analysis module on the GCBI platform (Supplementary File), we screened for DEGs using a fold change in expression >5 with cutoff values of Q < 0.05 and P < 0.05. Then, we used the Gene Co-expression Network module on the GCBI platform to create a gene co-expression network for the DEGs that eliminated gender differences.
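The DEG screening thresholds stated above (fold change >5, Q < 0.05 and P < 0.05) can be expressed as a simple filter. The sketch below is illustrative only and does not reproduce the GCBI module: the `GeneStat` record, the function `is_deg`, the gene names and the numbers are hypothetical, and it assumes that fold change is a positive tumor-to-normal ratio applied symmetrically, so that a 5-fold decrease counts the same as a 5-fold increase.

```python
from dataclasses import dataclass

@dataclass
class GeneStat:
    symbol: str
    fold_change: float  # assumed tumor/normal ratio, > 0; values < 1 mean down-regulation
    p_value: float
    q_value: float      # multiple-testing-adjusted value

def is_deg(g, min_fold=5.0, alpha=0.05):
    """Apply the fold-change, P and Q cutoffs described in the text."""
    # Assumption: treat up- and down-regulation symmetrically.
    magnitude = max(g.fold_change, 1.0 / g.fold_change)
    return magnitude > min_fold and g.p_value < alpha and g.q_value < alpha

# Hypothetical toy statistics, for illustration only.
stats = [
    GeneStat("GENE_X", 0.1, 0.001, 0.01),   # 10-fold down, significant
    GeneStat("GENE_Y", 6.0, 0.01, 0.20),    # fails the Q cutoff
    GeneStat("GENE_Z", 2.0, 0.001, 0.001),  # fails the fold-change cutoff
]
degs = [g.symbol for g in stats if is_deg(g)]
```

Only the first hypothetical gene passes all three cutoffs. Whether the platform applies the fold-change threshold symmetrically is an assumption of this sketch.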
Gene Expression Omnibus (GEO) DataSets. GEO DataSets (https://www.ncbi.nlm.nih.gov/gds) is the public repository for high-throughput gene expression datasets at the National Center for Biotechnology Information. In this study, we selected datasets according to the following inclusion criteria: (1) human lung cancer specimens that included the pathological type adenocarcinoma; (2) normal lung tissues used as controls; (3) specimens with gender information; (4) no fewer than 100 samples; (5) expression profiling by
datasets, this online tool is suitable for in silico validation of new biomarkers related to survival for patients with non-small cell lung cancer 50. We used the Kaplan-Meier plotter to assess the prognostic value of TCF21 expression in patients with lung adenocarcinoma or squamous cell carcinoma.
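For readers unfamiliar with the survival curves such a tool draws, the standard Kaplan-Meier product-limit estimator can be sketched in a few lines of Python. This is a generic textbook illustration, not the code of the Kaplan-Meier plotter website; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times:  follow-up durations
    events: 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= removed
    return curve

# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

With these toy data the estimated survival falls to approximately 0.8, 0.6 and 0.3 at 5, 8 and 12 months. Censored patients lower the number at risk without producing a step in the curve.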
PrognoScan. PrognoScan (http://www.prognoscan.org/) is a comprehensive platform for evaluating potential tumor biomarkers and therapeutic targets. Based on a large collection of cancer microarray datasets with clinical annotation, PrognoScan is a useful online tool for assessing the association between specific gene expression and prognosis in patients with cancer 51 . We used the PrognoScan platform to validate the prognostic value of TCF21 expression in patients with lung adenocarcinoma or squamous cell carcinoma.
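One common way to test whether survival differs between a low-expression and a high-expression group is the two-group log-rank test. The sketch below is a generic illustration of that test and is not presented as PrognoScan's own procedure; the function names `logrank_chi2` and `chi2_1df_pvalue`, the grouping and the data are hypothetical.

```python
import math

def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank statistic (chi-square with 1 degree of freedom).

    Assumes at least one event time with subjects at risk in both groups.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance

def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical survival times (months); every patient has an observed event.
low_expr_times, low_expr_events = [2, 3, 4], [1, 1, 1]
high_expr_times, high_expr_events = [5, 6, 7], [1, 1, 1]
chi2 = logrank_chi2(low_expr_times, low_expr_events, high_expr_times, high_expr_events)
p = chi2_1df_pvalue(chi2)
```

In this toy example all low-expression patients die before any high-expression patient, so the statistic exceeds the conventional 3.84 threshold for P < 0.05. How expression is dichotomized into low and high groups is a separate choice that this sketch leaves open.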