Overexpression of LINC00152 correlates with poor patient survival and knockdown impairs cell proliferation in lung cancer

We employed RNA sequencing analysis to reveal dysregulated lncRNAs in lung cancer utilizing 461 lung adenocarcinomas and 156 normal lung tissues from 3 separate cohorts. We found that LINC00152 was highly overexpressed in lung tumors as compared to their adjacent normal tissues. Patients with high LINC00152 expression demonstrated significantly poorer survival than those with low expression. We verified the diagnostic/prognostic potential of LINC00152 expression in an independent cohort of lung tumor tissues using quantitative RT-PCR. After knockdown of LINC00152 using siRNAs in lung cancer cell lines, both cell proliferation and colony formation were decreased. Cell fractionation and qRT-PCR analysis indicated that LINC00152 is found mainly in the cytoplasm. Treatment with Trichostatin A in cell lines having low LINC00152 expression indicated that histone acetylation may be one mechanism underlying LINC00152 overexpression in NSCLC. Western blot analyses indicated that p38α, STAT1, STAT3, CREB1, CCNE1 and c-MYC proteins were decreased after LINC00152 siRNA treatment. Our study indicates that LINC00152 plays an important role in lung tumor growth and is potentially a diagnostic/prognostic marker. Further characterization of how LINC00152 regulates its target proteins may provide a novel therapeutic target for lung cancer.

It is estimated that 595,690 Americans will die from cancer in 2016, and more than one-quarter of these deaths (158,080) will be due to lung cancer. Lung cancer continues to be the number one cause of cancer-related death in both men and women worldwide 1. While recent advances using screening CT scans are diagnosing early disease more frequently, the 5-year relative survival is still low (18%). These low survival rates are partly due to the fact that one-half of cases are diagnosed at an advanced stage, for which 5-year survival is only 4% 1, 2. While multiple molecular events converge to trigger unregulated growth, invasion, and metastasis in lung cancer, the exact mechanisms are not fully understood. Thus, there is an urgent need to identify markers that may aid in the early diagnosis or stratification of lung cancer, as well as new therapeutic targets.
Accumulated evidence shows that more than 70% of the human genome is transcribed into primary RNA, but only about 2% encodes peptide products, with the remainder being noncoding RNAs (ncRNAs) 3,4. These ncRNAs can be divided into two groups based on their transcript lengths: small ncRNAs, which are shorter than 200 bp, and long ncRNAs (lncRNAs), which are longer than 200 bp 5. Compared to protein-coding RNAs, lncRNAs are usually expressed in a tissue-specific pattern, at low levels, and with low sequence conservation. Through gene expression microarrays and RNA sequencing analysis, hundreds of lncRNAs have been reported to be dysregulated in lung cancer [6][7][8]; however, only a few have been well characterized regarding their functional role in cancer 9. LncRNAs are thought to drive many important cancer phenotypes and disease-related pathways, such as those controlling cellular proliferation, invasion, development, lineage commitment, immune response, pluripotency and differentiation [10][11][12][13]. The cellular localization of lncRNAs may determine their functional roles: nuclear lncRNAs are enriched for functions involving chromatin interaction, transcriptional regulation and RNA processing, while cytoplasmic lncRNAs can modulate mRNA stability or translation and influence cellular signaling cascades 14.
LINC00152 has been linked to several human cancers and promotes cell proliferation in gastric and hepatocellular carcinoma (HCC) 15,16. Additionally, it may act as a potential prognostic biomarker and therapeutic target in colorectal cancer, clear cell renal carcinoma and HCC [17][18][19]. Moreover, since plasma levels of LINC00152 are significantly elevated in patients with gastric cancer, this lncRNA has the potential to serve as a blood-based biomarker for this disease 20. The expression of LINC00152 and its functional roles in lung cancer, however, are unexplored. In this study, through analysis of RNA-Seq data from a large cohort of lung cancers, we demonstrate that LINC00152 is highly expressed in lung cancer and associated with poor patient survival. We validate its expression in an independent cohort of primary lung cancers using RT-PCR and explore its oncogenic functions in lung cancer cell lines, as well as the possible molecular mechanisms involved.

Results
Increased LINC00152 expression is correlated with worse prognosis in patients with lung ADs. To identify dysregulated lncRNAs and their diagnostic potential in lung adenocarcinomas (LUAD), the largest subtype of NSCLC, we performed Receiver Operating Characteristic (ROC) curve analysis with combined RNA-Seq data from our cohort (UM, 67 LUADs and 6 normal lung tissues) and two other independent published LUAD cohorts, Seo (85 LUADs and 85 normal lung tissues) 21 and TCGA (309 LUADs and 73 normal lung tissues) 22. We have identified lncRNAs differentially expressed in LUADs as compared to normal lung tissues 6. Among the most highly overexpressed lncRNAs, LINC00152 was found to be significantly increased in LUADs. Scatterplots showed that LINC00152 maintained a high expression level in tumors (vs. normal) in the three cohorts (Fig. 1A-C). The area under the curve (AUC) values from ROC analysis were larger than 0.74 in all 3 cohorts (Fig. 1D-F), indicating that LINC00152 may potentially be used as a novel diagnostic marker for this type of lung cancer. We next evaluated the association between LINC00152 and patient survival in two independently published LUAD microarray data sets where survival information was available, Okayama et al. (226 LUADs, stage 1 and 2) 23 and TCGA (197 LUADs, stage 1 to 3) 22. Kaplan-Meier survival curves and log-rank tests showed that higher expression levels of LINC00152 were significantly correlated with poor patient outcome in the Okayama (p = 0.001) and TCGA data sets (p = 0.03) (Fig. 1G,H), whereas patients with relatively lower levels of LINC00152 expression showed better survival. Taken together, LINC00152 was significantly overexpressed in LUAD across multiple studies, and overexpression was a predictor of poor patient survival in LUADs.

Validation of LINC00152 expression pattern in an independent cohort of LUADs by qRT-PCR.
To verify the LINC00152 expression pattern discovered from the RNA-Seq and microarray data sets, we performed qRT-PCR using RNA from an independent cohort from UM including 101 LUADs and 27 normal lung tissues. The results showed that LINC00152 expression levels were significantly higher in LUAD as compared to normal lung tissues (p < 0.001) (Fig. 2A). The AUC was 0.88, indicating that LINC00152 expression levels could distinguish tumors from normal lung (Fig. 2B). The Kaplan-Meier survival curve and the log-rank test indicated that higher expression of LINC00152 was significantly related to worse patient survival (p = 0.003) (Fig. 2C). We also analyzed LINC00152 expression levels against other clinical variables from this validation set; however, no evidence was obtained to support an association between LINC00152 expression and age, gender, smoking, differentiation, tumor stage, lymph node or KRAS mutation status (Supplementary Table 1).
Overexpression of LINC00152 has been reported in colorectal cancer, clear cell renal carcinoma and HCC. To determine whether LINC00152 expression is also higher in other cancers, we analyzed RNA-Seq expression data including 6,220 cancers from the MiTranscriptome database 24. We found that LINC00152 was increased in most cancers, including bladder, breast, gastric, head/neck, kidney, liver, thyroid, cervix, blood, uterus, and colon cancers (FPKM log2 value > 1.4), but not in prostate cancer (FPKM log2 = 0.67) (Fig. 2D). Interestingly, we found that squamous cell (LUSC) and large cell (LULC) lung carcinomas also highly expressed LINC00152 (vs. normal). These results indicate that LINC00152 is highly expressed not only in lung cancer but also in other types of cancer and could potentially be useful as a general cancer marker.

LINC00152 overexpression caused by histone acetylation.
Gene expression can be influenced by genomic copy number changes, or by specific transcription factors coupled with changes in histone and DNA modifications in the gene promoter. We examined DNA copy number changes of the LINC00152 genomic locus using Affymetrix SNP 6.0 arrays on 90 lung adenocarcinomas and 10 normal lung tissues (unpublished data) and did not find any apparent amplification except for small gains in two cases (Fig. 3A). To test whether LINC00152 expression is related to histone acetylation and promoter DNA methylation, we first analyzed its expression levels in 33 lung cancer cell lines from RNA-Seq data. We found that LINC00152 was highly expressed in 30 NSCLC cell lines, with lower expression in 3 small cell lung cancer cell lines, H146, H526 and H82 (Fig. 3B). We hypothesized that histone deacetylation or promoter DNA methylation may be the mechanism causing the low expression of LINC00152 in H146, H526 and H82 cells.
We then treated these cell lines with 5-aza-2-deoxycytidine (5-AZA) and/or Trichostatin A (TSA) and found that LINC00152 expression was increased by 16-fold in H526 and 8-fold in H146 cells after TSA treatment, whereas 5-AZA did not change LINC00152 expression (Fig. 3C), indicating that histone acetylation could be one mechanism causing LINC00152 overexpression in NSCLC. It remains unclear why H146 cells treated with a higher concentration of TSA (0.5 µM) lost the increase in LINC00152 expression. However, we found that these two small cell lung cancer cell lines showed different cell growth responses to LINC00152 siRNA treatment (Supplementary Figure S3).

LINC00152 knockdown decreased cell proliferation and colony formation in lung cancer cells.
Based on the RNA-Seq values in Fig. 3B, the NSCLC cell lines have high LINC00152 expression levels. To confirm this expression, we performed RT-PCR for LINC00152 on the cell lines to be used in the proliferation assay. A significant correlation between RNA-Seq and RT-PCR was found using Pearson correlation analysis (Supplementary Figure S1). The small cell lung cancer cell line H526 still showed the lowest LINC00152 expression. To minimize the possibility of off-target effects, we used SMARTpool gene-specific siRNAs whose knockdown efficiency was greater than 80-90% as determined by qRT-PCR (Fig. 4A and Supplementary Figure S2). To explore the oncogenic function of LINC00152, we examined cell proliferation after LINC00152 knockdown by siRNA in 12 lung cancer cell lines. A significant decrease in cell proliferation rate (from 18% to 38%) was identified in 9 out of 12 cell lines as measured by WST-1 (Fig. 4B and Supplementary Figure S3A). The cell growth of three cell lines, H1975, H1650 and H146, was not affected by LINC00152 knockdown.
PC-9 (an EGFR mutant and EGFR tyrosine kinase inhibitor (TKI) sensitive cell line) and H838 (an EGFR wild type cell line) cells were the most significantly affected (decreased by 36% and 38%, respectively) and were therefore chosen for the colony formation assay. The number of colonies was markedly reduced after LINC00152 siRNA knockdown in PC-9 and H838 cells (Fig. 4C and D). LINC00152 knockdown did not affect cell invasion in Boyden chamber matrix assays in these two lung cancer cell lines (data not shown).
We did not find a significant correlation between LINC00152 expression and the inhibition rate of cell growth by LINC00152 knockdown (Supplementary Figure S3B). The cell growth of H1975 and H1650 cells was not affected by LINC00152 siRNA knockdown, although these cells have relatively high levels of LINC00152 expression. In contrast, the cell growth of H526 was affected, although this cell line has a relatively low level of LINC00152 expression, while the cell growth of H146, another cell line with low LINC00152 expression, was not affected. This indicates that cell growth will be affected if LINC00152 expression reaches a certain level and that the genomic background may be the key factor determining the role of LINC00152 in cell growth.

Proteins/mRNAs regulated by LINC00152 in lung cancer cells.
To provide molecular mechanistic insight into the role of LINC00152 in regulating lung cancer cell proliferation, we first performed receptor tyrosine kinase (RTK) phosphorylation antibody array analysis, which includes 49 different phosphorylated proteins covering most of the cancer-related pathways. The phosphorylation levels of STAT1 and STAT3 proteins were found to be decreased when LINC00152 was knocked down by siRNA at 72 hours (Supplementary Fig. S4A and B). To confirm these findings and identify additional proteins regulated by LINC00152, we performed Western blots on several cell growth-related proteins. As indicated in Fig. 5A and B, p38α, STAT1, STAT3, CCNE1, CREB1 and c-MYC proteins were decreased after LINC00152 siRNA treatment in PC-9 and H838 cell lines. However, total p38α and CCNE1 proteins were not changed in the H838 cell line, indicating that LINC00152 regulates p38α and CCNE1 by different mechanisms in PC-9 and H838 cells. We also measured the mRNA expression of these genes (Supplementary Fig. S5) and found that STAT3 mRNA was decreased by 30-40% in both cell lines, whereas CCNE1 mRNA in H838 was increased by 1.5 fold. The mRNAs of the other genes (p38α, STAT1, CREB1 and c-MYC) were not changed. These results indicate that LINC00152 may regulate STAT3 through transcriptional regulation, whereas other proteins, such as p38α, STAT1, CREB1, and c-MYC, may be regulated at the post-transcriptional level. The mechanism by which LINC00152 regulates these proteins, such as whether it acts through a lysosome- or ubiquitin-dependent process, warrants analysis in the future.
Since EGFR signaling was reported to be involved in LINC00152 promoting cell proliferation in gastric cancer 15, we performed Western blots on EGFR, AKT and ERK1/2 proteins. We found that these proteins were not changed after LINC00152 siRNA treatment at 72 hours (Supplementary Fig. S4C), indicating that EGFR signaling was not involved in LINC00152 regulation in these lung cancer cells. To evaluate the relationship between p38α and STAT3, we performed p38α and STAT3 siRNA knockdown in PC-9 and H838 cells. As shown in Supplementary Fig. S6, the ratios of p-STAT3/t-STAT3 were increased by 1.6-1.7 fold (vs. NT) in both cell lines after p38α knockdown, and the mRNA levels were not changed, indicating that p38α may regulate STAT3 at the protein level. Figure 5E summarizes the proteins regulated by LINC00152 uncovered in this study. Further experiments, such as measuring changes in STAT1, CCNE1, CREB1 and c-MYC expression after p38α or STAT3 knockdown or overexpression, are warranted.
Finally, we examined the cellular localization of LINC00152. RNAs were isolated from total, nuclear and cytoplasmic fractions of these two cell lines. Quantitative RT-PCR indicated that LINC00152 was mainly located in the cytoplasm in both PC-9 (67.2%) and H838 cells (67.7%) (Fig. 5C and D). The primarily cytoplasmic localization of LINC00152 supports a role at the post-transcriptional level, which differs from a recent report showing LINC00152 interaction with EZH2 and transcriptional control of target genes in A549 and SPCA1 cell lines 25.

Discussion
As highly tissue-specific drivers of cancer phenotypes, lncRNAs are potentially prime targets for cancer therapy. Many studies have confirmed their utility as biomarkers not only in cancer diagnosis but also for patient prognosis across a variety of cancers [26][27][28][29][30] .
LINC00152 has been reported to be highly expressed in hepatocellular carcinoma, gastric cancer and clear cell renal carcinoma and to be involved in cancer progression [15][16][17][18][19]25. In the present study, we found that the average levels of LINC00152 in LUAD tissues were significantly higher than those in corresponding normal tissues, and higher expression of LINC00152 was associated with poor patient survival. These results suggest that LINC00152 may potentially be useful as a marker for lung cancer diagnosis and an indicator of poor survival. We found that knockdown of LINC00152 suppressed the proliferation and colony formation capability of lung cancer cells but did not affect tumor cell invasion, which was consistent with the results observed in both gastric cancer and hepatocellular carcinoma 15,16. In different cancer types, and even in different cancer cells, lncRNAs may regulate oncogenesis by different molecular mechanisms. In one study, LINC00152 was found to promote tumor growth through an EGFR-mediated PI3K/AKT pathway in the gastric cancer cell lines MGC803 and HGC-27 15, whereas in another study with the gastric cancer cell lines BGC-823 and SGC-7901, LINC00152 promoted GC tumor cell cycle progression by binding to enhancer of zeste homolog 2 (EZH2), thus silencing the expression of p15 and p21 31. In hepatocellular carcinoma, LINC00152 appears to activate the mechanistic target of rapamycin (mTOR) pathway by binding to the promoter of EpCAM through cis-regulation, which was confirmed by the Gal4-λN/BoxB reporter system 16. In the present study in lung cancer, LINC00152 knockdown led to a reduction in several cell growth-related proteins such as STAT3, p38α, CREB1, STAT1, c-MYC and CCNE1. This suggests that the molecular signaling affected by LINC00152 in lung cancer may be different from that in gastric and liver cancers.
The p38 MAPK pathway is known to regulate gene expression, allowing cells to respond to various extracellular stresses, and different stimuli can consequently activate p38 MAPKs and influence a variety of downstream molecules. CREB1, STAT1 and STAT3 are key factors for p38α signaling 32. CREB1 influences cell cycle arrest through increasing expression of CCNE1. The p38 MAPK-regulated CREB1 pathway was shown to contribute to selenite-induced colorectal cancer cell apoptosis in vitro and in vivo 33. Pretreatment with the p38 inhibitor SB203580 increased expression of p-STAT3 in the lung adenocarcinoma cell line A549 34, indicating that STAT3 was downstream of the ERK and p38 signaling pathways. We found that the ratios of p-STAT3/t-STAT3 were increased by 1.6-1.7 fold (vs. NT) in both PC-9 and H838 cells after p38α knockdown, and the mRNA levels were not changed, indicating that p38α may regulate STAT3 at the protein level.
STAT3 was shown to directly or indirectly upregulate the expression of genes required for uncontrolled proliferation and survival, including the genes that encode c-MYC, cyclin D1 and cyclin D2, BCL-XL, MCL1 and survivin 35,36. The increased expression of STAT1 and STAT3 may also positively affect c-MYC, thus precipitating a series of events during oncogenesis including apoptosis inhibition, cell proliferation, angiogenesis and anti-immune responses. We found that both STAT3 mRNA and protein were decreased after LINC00152 knockdown, indicating that LINC00152 may regulate STAT3 at the transcriptional level. Given our finding that LINC00152 was mainly located in the cytoplasm of lung cancer cells, the regulation of the growth-related proteins p38α, STAT1, CREB1 and c-MYC may occur at the post-transcriptional level, for example through a protein degradation mechanism. The detailed mechanism by which LINC00152 regulates these proteins, such as whether it acts through a lysosome- or ubiquitin-dependent process, warrants further study.
Taken together, our study demonstrated that LINC00152 was up-regulated in human LUAD and was correlated with poor survival of LUAD patients. Increased expression of LINC00152 might be involved in lung cancer development, and LINC00152 may serve as a potential marker for diagnosis and prognosis. Further characterization of how LINC00152 regulates STAT3, p38α and other proteins may provide a novel therapeutic target for lung cancer.

Materials and Methods
Cell culture. Human lung cancer cell lines were purchased from the American Type Culture Collection (ATCC, Manassas, VA). All cell lines were routinely maintained in RPMI 1640 supplemented with 10% FBS. All cell lines were cultured in a humidified incubator in a 5% CO2 atmosphere at 37 °C. All cell lines were genotyped for identity at the University of Michigan Sequencing Core and were tested routinely for mycoplasma contamination.
Lung tissue samples. Lung cancer tissues were collected from patients undergoing curative cancer surgery from 1991 to 2013 at the University of Michigan Health System. Informed consent was obtained from all patients, with experimental protocols receiving approval from the University of Michigan Institutional Review Board and Ethics Committee. The methods were carried out in accordance with approved guidelines. The specimens collected from surgery were freshly frozen in liquid nitrogen and then stored at −80 °C. Regions of frozen tissue containing a minimum of 70% tumor cellularity, as defined by cryostat sectioning, were utilized for RNA isolation. None of the included patients received preoperative radiation or chemotherapy.
Cell proliferation and colony formation. All the cell lines were plated in 96-well plates at the desired density. After plating for 24 hours, LINC00152 siRNA and NT control were transfected with Lipofectamine RNAiMax Reagent in OptiMEM medium. After transfection with 10 nM LINC00152 siRNA for 96 hours, cell proliferation reagents (WST-1) (Roche) were added to each well, and the absorbance was measured at wavelengths of 450 nm and 630 nm, according to the manufacturer's instructions. The cell viability percentages were calculated by normalizing to control siRNA.
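The normalization to control siRNA described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes that the 630 nm reading serves as a reference wavelength subtracted from the 450 nm signal and that an optional medium-only blank may be subtracted. The function names and the absorbance values are hypothetical.

```python
from statistics import mean


def corrected_absorbance(a450, a630):
    """Subtract the 630 nm reference reading from the 450 nm signal (assumed scheme)."""
    return a450 - a630


def percent_viability(treated, control, blank=0.0):
    """Viability of siRNA-treated wells as a percentage of NT control wells.

    treated, control: lists of (A450, A630) pairs, one pair per well.
    blank: corrected absorbance of medium-only wells (assumed 0 if not measured).
    """
    treated_signal = mean(corrected_absorbance(a, r) for a, r in treated) - blank
    control_signal = mean(corrected_absorbance(a, r) for a, r in control) - blank
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * treated_signal / control_signal


# Hypothetical triplicate wells (illustrative numbers, not data from this study)
nt_wells = [(1.30, 0.10), (1.20, 0.10), (1.40, 0.10)]
si_wells = [(0.85, 0.10), (0.90, 0.10), (0.80, 0.10)]
viability = percent_viability(si_wells, nt_wells)
inhibition = 100.0 - viability  # percentage decrease in proliferation
```

With these made-up wells the treated signal is 62.5% of control, i.e. a 37.5% decrease, which is the kind of quantity reported as the inhibition rate in the Results.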
A colony formation assay was performed to measure the number of viable cells after LINC00152 knockdown by siRNA. Forty-eight hours after siRNA transfection, PC-9 and H838 cells were trypsinized and counted, and equal numbers of cells (100 or 500 cells/well) were then seeded evenly onto 6-well plates in triplicate. Ten to fourteen days later, colonies were stained with 0.5% crystal violet, and colonies with more than 50 cells were counted.

RNA subcellular isolation and qRT-PCR.
To identify the subcellular localization of LINC00152, all the cell lines were plated in 100 mm plates. The cells were harvested at 80-90% confluence. The cytoplasmic and nuclear RNA were extracted following the instructions for the RNA Subcellular Isolation Kit (Active Motif). Complete lysis buffer was added to each cell pellet and each sample was centrifuged; cytoplasmic RNA was harvested from the supernatant, while nuclear RNA was present in the pellet. The cDNA was synthesized with the High Capacity cDNA Reverse Transcription kit (Applied Biosystems). Quantitative RT-PCR was performed using Power SYBR Green Master Mix (Thermo Fisher Scientific, USA) on an ABI StepOne Real-Time PCR System (Applied Biosystems). Each sample was analyzed in duplicate. The housekeeping gene GAPDH was used as the internal control.
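The text does not state how relative expression was computed from the qRT-PCR readings. One common approach with a housekeeping reference such as GAPDH is the comparative Ct (2^-ΔΔCt) method, sketched below. This is an assumption for illustration only: it presumes roughly 100% amplification efficiency for both amplicons, and the function names and Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Mean Ct of the target minus mean Ct of the reference gene (e.g. GAPDH)."""
    return mean(target_ct) - mean(reference_ct)


def relative_expression(sample_target, sample_ref, calib_target, calib_ref):
    """Fold change of a sample relative to a calibrator by the 2^-ddCt method.

    Each argument is a list of replicate Ct values. Assumes ~100% PCR
    efficiency, so one cycle corresponds to a two-fold difference.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(calib_target, calib_ref)
    return 2.0 ** (-ddct)


# Hypothetical duplicate Ct values (not data from this study):
# LINC00152 siRNA-treated cells vs. NT control, both normalized to GAPDH.
fold = relative_expression(
    sample_target=[25.0, 25.2], sample_ref=[18.0, 18.2],
    calib_target=[22.0, 22.2], calib_ref=[18.0, 18.2],
)
knockdown_percent = 100.0 * (1.0 - fold)
```

Here the target Ct shifts by three cycles relative to GAPDH, giving a fold change of 0.125, i.e. an 87.5% knockdown under the stated assumptions.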
Microarray and RNA sequencing datasets. One published Affymetrix microarray data set representing 226 primary lung AD tissues 23 was used for survival analysis. The CEL files of microarray data were normalized using the Robust Multi-array Average (RMA) method 37 . We also obtained RNA-Seq data sets from Seo 21 and TCGA 22 consisting of a total of 394 ADs, 212 SCCs and 150 normal lung tissues. Expression levels of transcripts were represented as FPKM 38 . Our primary outcome was overall survival, censored at five years. The information concerning adjuvant chemotherapy or radiation therapy was obtained from the original papers.
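For reference, FPKM (fragments per kilobase of transcript per million mapped fragments) can be computed from raw counts as in the sketch below. It uses the standard definition rather than the specific pipeline of the cited data sets. The log2 helper adds a pseudocount of 1, which is an assumption and not something stated in the text; all names and numbers are illustrative.

```python
import math


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def log2_fpkm(value, pseudocount=1.0):
    """log2 transform with a pseudocount (assumed here) so that zero FPKM is defined."""
    return math.log2(value + pseudocount)


# Hypothetical transcript: 500 fragments, 2,000 bp long, 20 million mapped fragments
example = fpkm(500, 2000, 20_000_000)
example_log2 = log2_fpkm(example)
```

Because FPKM divides by both transcript length and library size, values are comparable across transcripts of different lengths and across samples sequenced to different depths.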

Receptor tyrosine kinases (RTK) signaling antibody array and Western blot analysis.
To explore the possible regulation mechanisms of LINC00152, PC-9 and H838 cell lines were seeded on 6-well plates separately at a density of 10,000 cells per well. The cell lysates were collected after LINC00152 siRNA treatment at 72 h. The PathScan RTK antibody array (Cell Signaling) was performed according to the manufacturer's instructions. Western blots were also performed with 10 µg protein, which was separated by polyacrylamide gel electrophoresis and transferred to nitrocellulose membranes. After being blocked for 1 h with 5% non-fat milk, the membranes were incubated with primary antibodies on a rolling shaker overnight at 4 °C. After incubation with a secondary antibody for 1 hour at room temperature, the membranes were developed using the ECL kit (Amersham, Arlington Heights, IL) and exposed to X-ray film.
Statistical analysis. Data were analyzed using GraphPad Prism 6 (GraphPad software) and R software. To evaluate the diagnostic potential of LINC00152 in LUAD vs. normal, Receiver Operating Characteristic (ROC) curve analysis was used. The ROC curve shows the tradeoff between sensitivity and specificity (any increase in sensitivity is accompanied by a decrease in specificity) across the possible cut-points of a diagnostic test. The diagnostic accuracy was measured by the AUC (area under the curve). An AUC of 1 represents a perfect test; an AUC of 0.5 represents an uninformative test that performs no better than chance. The data are presented as the mean ± SEM from triplicate experiments and additional replicates as indicated. The significance of differences between groups was calculated with Student's t-test or paired t-test. Survival analysis was performed using the Kaplan-Meier method, and the curves were compared using the log-rank test. A p value < 0.05 was considered statistically significant.
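The AUC and Kaplan-Meier quantities used above can be illustrated with a small pure-Python sketch. It is not the GraphPad Prism or R code used in the study. The AUC is computed through its rank interpretation (the probability that a randomly chosen tumor has higher expression than a randomly chosen normal sample, with ties counted as one half), and survival is estimated with the product-limit formula after administrative censoring at a fixed horizon. Function names and all numbers are hypothetical.

```python
def roc_auc(tumor, normal):
    """AUC as the fraction of tumor/normal pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


def censor_at(times, events, horizon):
    """Administratively censor follow-up at `horizon` (e.g. five years).

    Events occurring after the horizon are treated as censored at the horizon.
    """
    new_times = [min(t, horizon) for t in times]
    new_events = [e if t <= horizon else 0 for t, e in zip(times, events)]
    return new_times, new_events


def kaplan_meier(times, events):
    """Product-limit survival estimate; returns [(time, S(t))] at each event time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical expression values and follow-up times in years (not study data)
auc = roc_auc(tumor=[5.1, 4.8, 6.0, 3.9], normal=[3.0, 4.0, 3.5])
times, events = censor_at([1, 2, 2, 3, 7], [1, 1, 0, 1, 1], horizon=5)
curve = kaplan_meier(times, events)
```

In this toy example 11 of the 12 tumor/normal pairs are ranked correctly, and the death at year 7 is converted to a censored observation at year 5 before the survival curve is estimated. Comparing two such curves with the log-rank test is left to a statistics package.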