Genome-wide analyses of long non-coding RNA expression profiles and functional network analysis in esophageal squamous cell carcinoma

Esophageal cancer (EC) is a serious malignancy and that is the fifth leading cause of cancer-related death worldwide. Esophageal squamous cell carcinoma (ESCC) is the main subtype of EC in China. In recent years, long non-coding RNAs (lncRNAs) have demonstrated to be novel tumor-associated regulatory factors. However, the functions and mechanisms of lncRNAs in ESCC have not been fully understood. In this study, we attempted to construct Genome-wide expression profiles of lncRNAs and their potential functions in ESCC. By using microarray, we found a total of 2,366 lncRNAs (1,032 upregulated and 1,334 downregulated) and 3,052 mRNAs (1,477 upregulated and 1,575 downregulated) were differentially expressed between the paired five ESCC tumor tissues and adjacent normal esophageal tissues (fold change, FC ≥2.0 or ≤0.5, p ≤ 0.05). Eight lncRNAs were detected by qRT-PCR to verify the results of the microarray, and the clinicopathological parameters were analyzed in 53 patients with ESCC. GO analysis and KEGG pathway analysis showed that the main biological functions of these abnormal lncRNAs were related to immune response, extracellular vesicular exosome, and protein binding. At the same time, the cis and trans models were used to analyze the potential synergistic regulatory relationship between lncRNAs and their potential target genes. Related genes were the processes that affect cell growth, differentiation, and migration. Then we mapped the lncRNAs-mRNAs co-expression pattern by calculating the PCCs of each lncRNA and mRNA expression value. Furthermore, we investigated the function and potential mechanism of a novel highly expressed lncRNA, lnc-KIAA1244-2, and found that its expression is associated with tumor size, N classification and clinical stage. Knockdown of lnc-KIAA1244-2 inhibited the cell proliferation and inhibited the TNFAIP3 expression in Eca-109 cells. Taken together, the expression patterns of lncRNAs and mRNAs in ESCC tumor tissues are different from those in normal adjacent tissues, and some abnormal expressed lncRNAs may play important roles in the development and progression of ESCC. Lnc-KIAA1244-2 could promote the cell proliferation of ESCC cells and might be a potent therapeutic target for ESCC.

www.nature.com/scientificreports www.nature.com/scientificreports/ has not been identified well 6 . Therefore, it is required to identify specific and reliable biomarkers involved in development and progression of ESCC to predict the clinical outcome of the tumor.
In recent years, non-coding RNA (ncRNA) has been demonstrated to be involved in the progression of disease, including cancer 7,8 . Increasing studies have focused on long non-coding RNAs (lncRNAs), which are greater than 200 nt and are unable to be translated into proteins [9][10][11] . Studies have shown that lncRNAs are involved in a variety of biological processes and play important roles in tumorigenesis and tumor progression [12][13][14][15] .
There is increasing evidence that lncRNAs may act as a tumor suppressor or promoter in various tumors and may be a new class of tumor biomarkers and therapeutic targets 12 . For instance, lncRNA MIR31HG is upregulated in ESCC tumor tissues and plasma, which is related to TNM stage and lymph node metastasis. Knockdown of MIR31HG can inhibit the proliferation, migration, invasion, and inhibited the expression of Furin and MMP1 in EC9706 and EC1 cells. It can be used as biomarkers for diagnosis or prediction of ESCC 16 . The discovery of lncRNAs provides a new strategy to understand the mechanism of tumor occurrence and development.
In the present study, we screened the differential expression profiles of lncRNAs in ESCC tissues by microarray and analyzed differentially expressed lncRNAs function and potential related acting molecules to further understand the role of lncRNAs in the occurrence and development of ESCC. In addition, we showed that lnc-KIAA1244-2 was upregulated in ESCC and could promote the proliferation of ESCC. Lnc-KIAA1244-2 might be a potent therapeutic target for ESCC.

Expression profiles of lncRNAs in ESCC.
To construct expression profiles of lncRNAs in ESCC, we used microarray to detect the expression of lncRNAs, as well as mRNAs in five pairs of ESCC tumor tissues and adjacent normal tissues. The expression of lncRNAs and mRNAs in ESCC tumor tissues was significantly different from that of the adjacent normal esophageal tissues. Principal component analysis (Fig. 1A) was applied to investigate the distribution of samples and verify the rationality of the experimental design. The samples in the same group were concentrated in two-dimensional space, indicating that these genes were selected and the biological repeats were good. The box plot was used to quickly compare the distribution of lncRNAs, and after normalization, the distributions of log 2 ratios among the tested samples were almost similar (Fig. 1B). Volcano plots were used to outline the misalignment of lncRNAs and mRNAs in datasets (Fig. 1C,D). The lncRNAs expression profile (Fig. 1E) and mRNAs expression profile (Fig. 1F) were established and clustered using hierarchical cluster analysis. Based on the microarray data, 3,052 lncRNAs and 2,366 mRNAs were identified to be differentially expression (FC ≥2.0 or ≤0.5, p ≤ 0.05). Among those, 1,032 and 1,334 lncRNAs were upregulated and downregulated, respectively; 1,477 and 1,575 mRNAs were upregulated and downregulated, respectively. The top 30 upregulated and downregulated lncRNAs and mRNAs are listed in Table 1. The most upregulated lncRNA and mRNA were lnc-MMP10-3 (absolute FC = 1469.9) and SPP1 (absolute FC = 774.8), respectively. The most downregulated lncRNA and mRNA were lnc-FLG2-2 (absolute FC = 817.0) and CRISP3 (absolute FC = 1701.3), respectively.
Potential function recognition of the differential expression lncRNAs in ESCC. We further analyzed the differential expression of lncRNAs between ESCC tumor tissues and normal tissues by GO analysis and KEGG pathway analysis. The top 500 terms in the GO terms list were highly enriched for immune response, mitotic cell cycle, cellular lipid metabolic process (ontology: biological process), extracellular vesicular exosome, nucleosome, extracellular matrix (ontology: cellular component) and protein binding, and immunoglobulin G binding (ontology: molecular function). The top 500 terms in the KEGG pathway were associated with valine, leucine and isoleucine degradation, systemic lupus erythematosus, extracellular matrix receptor interaction, alcoholism, and phagosome. The GO and KEGG pathway analyses are shown in Fig. 3A-D.
To explore the relationship between differentially expressed lncRNA and target mRNA in ESCC for each differentiated lncRNA, we calculated the enrichment of functional terms of co-expressed genes in top 400 differentially expressed lncRNAs. The highest up-regulated lncRNA was lnc-MMP10-3 (absolute FC = 1469.95) among ESCC tissues versus paired non-carcinoma tissues and a total of 809 genes (e.g. S100A16, OLFML2B, and CAST) were related to lnc-MMP10-3 as the standard of absolute PCCs value > 0.8. The top 30 co-expressed genes of lnc-MMP10-3 are listed in Table S2 (see Supplementary). Furthermore, GO and KEGG pathways were applied to annotate the lnc-MMP10-3 co-expressed mRNA function. The top 30 reliably predicted terms of GO analysis are listed in Table S3 (see Supplementary). It shows that significantly enriched GO terms were involved in fatty-acyl-CoA binding, G1/S transition of mitotic cell cycle, Golgi membrane, MAPK cascade, and transfer RNA-intron endonuclease activity. Moreover, KEGG pathways analysis results, including fatty acid elongation, primary bile acid biosynthesis, valine, leucine, and isoleucine degradation, glutathione metabolism, and amino sugar and nucleotide sugar metabolism, are listed in Table S4 (see Supplementary).

Knockdown of lnc-KIAA1244-2 inhibited the cell proliferation and inhibited the TNFAIP3 expression in Eca-109 cells.
To further investigate the functions and mechanisms of lncRNAs in ESCC, one of the upregulated lncRNAs, lnc-KIAA1244-2, was identified for further investigation in ESCC cells. As shown in Fig. 6A, the expression levels of lnc-KIAA1244-2 in the ESCC cell lines were higher than that in the normal esophageal epithelial cell line Het-1 A and was prominently upregulated in the Eca-109 cell lines. www.nature.com/scientificreports www.nature.com/scientificreports/ Furthermore, short hairpin RNA (shRNA) mediated by a lentiviral vector (LV) was used to knockdown lnc-KIAA1244-2 in Eca-109 cells. As shown in Fig. 6B, the expression level of lnc-KIAA1244-2 was significantly downregulated in the shRNA-LV groups compared with the expression level in the control-LV group.
The CCK-8 assays and cell cycle assays indicated that the knockdown of lnc-KIAA1244-2 observably inhibited the proliferation of Eca-109 cells (Fig. 6C). Cell cycle analysis demonstrated that lnc-KIAA1244-2 knockdown led to cell cycle arrest at G0/G1 phase (Fig. 6D). These results indicated that lnc-KIAA1244-2 promoted the proliferation of ESCC cells. In addition, the mRNA level and protein level of TNFAIP3 were detected in lnc-KIAA1244-2 knockdown Eca-109 cells, since TNFAIP3 was predicted to be the target of lnc-KIAA1244-2. The results in Fig. 6E,F showed that both the TNFAIP3 mRNA and protein levels were downregulated after lnc-KIAA1244-2 knockdown in Eca-109 cells, indicating lnc-KIAA1244-2 might regulate ESCC cell proliferation via mediating TNFAIP3.

Discussion
In the past 50 years, researchers have concentrated on the role of mRNA in disease processes. However, the Human Genome Project found that fewer than 2% of the genomes encode proteins, more than 98% of the genomes transcribe ncRNAs, and most ncRNAs were considered to be transcribed to lncRNAs 17 . In view of the important functional roles revealed by small ncRNAs, these lncRNAs have received increasing attention in recent years in their possible functions 6 . Some ESCC-related lncRNAs have been identified, such as lncRNA SBF2-AS1, AK001796, and MIR31HG, and can be used as biomarkers for the diagnosis or prediction of ESCC 16,18,19 . However, the amount of identified lncRNAs is only a small fraction, and the function of a large number of lncR-NAs requires further investigation.  www.nature.com/scientificreports www.nature.com/scientificreports/ In this study, we screened the genome-wide expression profiles of lncRNAs and mRNAs in ESCC and found that the expression of lncRNAs and mRNAs in ESCC tumor tissues were significantly different from that of the adjacent normal esophageal tissues. It indicates that lncRNAs are abnormally expressed in ESCC. We randomly selected eight dysregulated lncRNAs (lnc-MMP1-2, lnc-ABCA12-3, lnc-PTPN7-3, lnc-KIAA1244-2, lnc-SLC25A24-1, lnc-ARL4A-4, lnc-FBXL2-4 and lnc-SNRNP27-1) to verify microarray data by qRT-PCR. The results were consistent with the results of the microarray data, indicating that the microarray data are credible.
Then we analyzed the target genes of the top 400 differentially expressed lncRNAs based on co-expressed analysis and predicted the function of the top 500 lncRNAs using GO analysis and KEGG pathway. Most of these potential cis-regulatory target genes, including cell growth, differentiation, and migration, are associated with tumorigenesis and tumor progression. The lncRNA expression profiles overlapping our current study and previous studies confirmed the reliability of this study. For example, SOX2OT is highly expressed in ESCC tumor www.nature.com/scientificreports www.nature.com/scientificreports/ tissues 20,21 . In our data, SOX2OT expression was upregulated with an average expression FC of 2.0. Our data also suggested that cis-regulated target genes were involved in the initiation and progression of ESCC. For instance, FN1 is highly expressed in a variety of tumor tissues including ESCC, which can promote tumor metastasis and invasion [22][23][24] . In our data, FN1 was found to be highly expressed in ESCC tissues (FC = 11.5).
LncRNAs can interact directly with gene promoters and TFs. These lncRNAs enhance and regulate the activity of promoters by recruiting protein factors 25 . LncRNAs control the transcription process through their interaction with primary coding transcripts. The differential expression lncRNAs may be mostly regulated by the 5 TFs (E2F4, STAT2, STAT3, KAT2A, STAT1). E2F4 is a cyclical protein and a transcription factor with transcriptional inhibition. STAT1 and STAT2 are associated with the regulation of cell apoptosis. STAT3 as an oncogene is associated with the regulation of cell proliferation 26 . KAT2A is an acetyltransferase that alters the structure of DNA by transferring acetyl groups. At the same time, some studies have shown that KAT2A plays an important role in regulating the function of stem cells, and can promote the normal development of embryos 27,28 , thereby enhancing the transcriptional level of specific DNA and selectively producing various traits.
In addition, we investigated the association between expressions of the validated eight lncRNAs and clinicopathological parameters in ESCC. The results showed that multiple lncRNAs were associated with clinicopathological parameters. We found that lnc-KIAA1244-2 expression was positively associated with tumor size, N classification and clinical stage. Thus, the function and mechanism of lnc-KIAA1244-2 were further studied. Lnc-KIAA1244-2 is 234 nucleotides in length and is located on chromosome 6q23.3. An important role of lncRNAs is to regulate the expression of adjacent protein-coding genes through cis regulatory model 13 . The results of cis-regulation showed that the genes encoding TNFAIP3 and lnc-KIAA1244-2 are located at the same locus. It was previously reported that TNFAIP3 is involved in proliferation, migration and invasion of EC cells 29 .
By knockdown assays, we demonstrated that lnc-KIAA1244-2 could promote cell proliferation and regulate TNFAIP3 expression. Therefore, we speculate that lnc-KIAA1244-2 promotes the progression of ESCC by regulating TNFAIP3. Lnc-KIAA1244-2 might be a biomarker of diagnosis and prognosis and a therapy target of ESCC. However, functions and mechanisms of lnc-KIAA1244-2 need further study.
In conclusion, this study demonstrates that the expression patterns of lncRNAs and mRNAs in ESCC tumor tissues are different from those in normal adjacent tissues, and some abnormal expression of lncRNAs may play important roles in the development and progression of ESCC. The profiles we established in ESCC will provide a foundation for further study on the mechanism of differentially expressed lncRNA in ESCC and their clinical significance. These lncRNAs identified in this study are promising biomarkers for diagnosis, classification, prognosis, and therapeutic evaluation in patients with ESCC.  integrity was measured by NanoDrop ND-2000 (Fisher Thermo Scientific) and its integrity was evaluated by Agilent Bioanalyzer 2100 (Agilent Technologies). The samples were labeled, hybridized, and washed depending on the standard protocols of the manufacturer. In short, the total RNA is transcribed into double-stranded complementary DNA (cDNA) and then synthesized into antisense RNA (aRNA). The second cycle of cDNAs were synthesized from aRNA, then fragmented and biotinylated followed by microarray hybridization. After washing and staining, the arrays were scanned by Affymetrix Scanner 3000 (Affymetrix).
Quantitative real-time polymerase chain reaction validation. The results of the microarray were verified by quantitative real-time polymerase chain reaction (qRT-PCR). Total RNA was extracted by Trizol reagent (Ambion) and reverse transcribed by GoScript TM reverse transcription system (GeneCopoeia). QRT-PCR was implemented using All-in-One TM qPCR Mix (GeneCopoeia). GAPDH was used as an internal reference gene. Primers for qRT-PCR analysis of lncRNAs are listed in Table S1 (see Supplementary). The lncRNA expression levels between ESCC tumor tissues and adjacent normal tissues were compared using paired t test.
Bioinformatics analysis. The extraction of raw data was based on Affymetrix Gene Chip Command Console (version 4, Affymetrix) software. RMA standardization was provided for gene and exon level analysis using Expression Console (version 1.3.1, Affymetrix) software. Gene expression analysis was performed using (2019) 9:9162 | https://doi.org/10.1038/s41598-019-45493-5 www.nature.com/scientificreports www.nature.com/scientificreports/ GeneSpring software (version 14.8, Agilent Technologies). Differentially expressed genes were identified by fold change (FC) and P values calculated with t tests. The threshold value for upregulated and downregulated genes was a FC ≥2.0 and p ≤ 0.05. Subsequently, Gene Ontology (GO: http://www.geneontology.org) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG: http://www.me.jp/kegg) analysis were used to determine the role of these differentially expressed lncRNAs. Finally, the differentially expressed lncRNAs and mRNAs in the samples were described by hierarchical clustering. Differential expression analysis. Differentially expressed lncRNAs and mRNAs were identified by paired t test (FC ≥2.0 or ≤0.5, p < 0.05 and FDR <0.05). The microarray data have been uploaded in NCBI Gene Expression Omnibus (GEO) and the GEO accession number is GSE120356 (http://www.ncbi.nlm.nih.gov/geo/ query/acc.cgi?acc = GSE120356).
prediction of lncRNAs function. The positive correlation between lncRNAs and mRNAs was expressed as an absolute value of Pearson correlation ≥0.7 and p ≤ 0.05. The enrichment of co-expressed mRNAs was calculated by hypergeometric cumulative distribution function. Then we selected top 100 upregulated and downregulated lncRNAs and searched for co-expressed genes within the 100-kb window of each lncRNA (p ≤ 0.05). The genes on both sides of lncRNA were potential cis regulatory genes. A determination needed to be made as to which genes may be trans regulated by lncRNAs. The JEMBOSS software was used to determine which TFs might interact with lncRNAs. For each abnormal expression of lncRNA, coded genes were calculated and the significance of gene enrichment in each TF entry was calculated by hyper-geometric distribution test. Small P values indicate that the differential expression of the gene was enriched in the TF entry. We used Cytoscape software to draw TFs, mRNAs, and lncRNAs relational networks.
Cell culture and culture conditions. The normal human esophageal epithelial-1 cell (Het-1A) and ESCC cell lines (Eca-109, KYSE-510, TE-1, KYSE-70, TE-11 and KYSE-150) were cultured in an incubator (Thermo Fisher Scientific, Waltham, MA, USA) that contained 5% CO2 and maintained at 37 °C. Culture medium contained penicillin-streptomycin, RPMI 1640 medium (Basal Media, China) and 10% fetal bovine serum (FBS; Biological Industries, Israel). Western blotting analysis and antibodies. RIPA buffer (MultiSciences, Hangzhou, China) was used to extract the total protein. After the protein extracts were separated on 10% SDS-PAGE and electro transferred onto polyvinylidene difluoride (PVDF) membranes, the PVDF membranes were blocked with QuickBlock ™ Blocking Buffer (Beyotime Biotechnology, Shanghai, China) for 20 min. Then, the membrane was immunostained with antibodies (anti-beta-actin antibody, Affinity; TNFAIP3 antibody, Novus) overnight at 4 °C. Subsequently, 1:5000 goat anti-rabbit or goat anti-mouse secondary antibodies were used for culturing 1 hour at room temperature. The bands were measured using a MiniChemi ™ Imaging system and ImageJ software. statistical analysis. Data analysis using SPSS software (version 19.0, Chicago, IL). The categorical data were analyzed by a chi-square test or Fisher exact tests, and displayed by cross tables. All experiments in the present study were repeated at least 3 times, and the data collected from 3 independent experiments are presented as the mean ± SD. The differences between the groups were assessed by Student's t-test. All tests were two-sided. P-values < 0.05 were considered statistically significant.

Lentivirus-Mediated Small
Ethics approval and consent to participate. Ethical approval for this study was obtained from the Institutional Ethics Committee of Hunan Cancer Hospital and with the 1964 Helsinki declaration and its later amendments.

Data Availability
The datasets used during the current study are available from the corresponding author on reasonable request.