ALDH1A3-regulated long non-coding RNA NRAD1 is a potential novel target for triple-negative breast tumors and cancer stem cells

To discover novel therapeutic targets for triple-negative breast cancer (TNBC) and cancer stem cells (CSCs), we screened the long non-coding RNAs (lncRNAs) most enriched in TNBCs for high expression in CSCs, defined by high Aldefluor activity, and for association with worse patient outcomes. This led to the identification of non-coding RNA in the aldehyde dehydrogenase 1 A pathway (NRAD1), also known as LINC00284. Targeting NRAD1 in TNBC tumors using antisense oligonucleotides reduced cell survival, tumor growth, and the number of cells with CSC characteristics. Expression of NRAD1 is regulated by aldehyde dehydrogenase 1A3 (ALDH1A3), an enzyme that causes Aldefluor activity in CSCs, and by its product retinoic acid. Cellular fractionation revealed that NRAD1 is primarily localized to the nucleus, which suggested a potential function in gene regulation. This was confirmed by transcriptome profiling and chromatin isolation by RNA purification followed by sequencing (ChIRP-seq), which demonstrated that NRAD1 has enriched chromatin interactions among the genes it regulates. Gene Ontology enrichment analysis revealed that NRAD1 regulates expression of genes involved in differentiation and catabolic processes. NRAD1 also contributes to gene expression changes induced by ALDH1A3; thus, the induction of NRAD1 is a novel mechanism through which ALDH1A3 regulates gene expression. Together, these data identify lncRNA NRAD1 as a downstream effector of ALDH1A3, and a target for TNBCs and CSCs, with functions in cell survival and regulation of gene expression.

, five of the lncRNAs had RNA-seq expression data that were extractable by cBioportal from the TCGA breast cancer Cell 2015 dataset. The overall survival for the 107 patients with basal invasive ductal carcinoma was plotted based on expression of the lncRNAs and the associated clinical data, which were also extracted from cBioportal. The patients were divided into high or low expression groups based on whether their expression fell in the top or bottom half, and the survival plots were generated with GraphPad Prism software. HR = hazard ratio (log-rank). The p-value was calculated with the log-rank (Mantel-Cox) test.

Figure 3. TCGA (Breast Invasive Carcinoma, BRCA dataset) overall survival analysis of lncRNAs enriched in CSC populations.
Using the TANRIC portal, expression of the 10 lncRNAs identified in Figure 1E was assessed for correlation with overall survival in 139 basal breast cancer patients who are part of the TCGA-BRCA dataset. The survival plots were generated in and exported from TANRIC. The hazard ratio was not included in the analysis by TANRIC.

Figure 4. NRAD1 is predominantly expressed in basal-like breast cancer cell lines.
NRAD1 expression in 21 cancerous and two normal-like breast cell lines was determined by QPCR. PUM1 and ARF1 were used as reference genes in the panel based on their target stability values across all 23 cell lines (n = 4). Error bars represent standard deviation.

Figure 5. NRAD1 is poorly expressed in most normal human tissues.
The GTEx Portal (gtexportal.org) was used to assess NRAD1 expression levels across human tissues. The Genotype-Tissue Expression (GTEx) project is a resource database and associated tissue bank used to study the relationship between genetic variation and gene expression, and other molecular phenotypes, in multiple reference tissues.
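The survival comparisons in these legends divide patients into high and low expression halves and compare the groups with a log-rank (Mantel-Cox) test. The following is a minimal sketch of that calculation, not the procedure implemented by GraphPad Prism or TANRIC. The function names, the rule that a patient exactly at the median is assigned to the low group, and the toy data in the usage note are all assumptions made for illustration.

```python
import math
from statistics import median


def median_split(expression):
    """Label each patient 'high' or 'low' relative to the cohort median.

    Assumption: a value equal to the median is placed in the 'low' group.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def logrank_test(times, events, groups):
    """Two-group log-rank (Mantel-Cox) test.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    groups : 'high' or 'low' per patient
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    observed_minus_expected = 0.0
    variance = 0.0
    # Only time points with at least one observed death contribute.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = [(g, tt, e) for tt, e, g in zip(times, events, groups) if tt >= t]
        n = len(at_risk)
        n_high = sum(1 for g, _, _ in at_risk if g == "high")
        d = sum(1 for _, tt, e in at_risk if tt == t and e == 1)
        d_high = sum(1 for g, tt, e in at_risk if g == "high" and tt == t and e == 1)
        # Observed deaths in the 'high' group minus the number expected
        # if both groups shared the same hazard.
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, with hypothetical values `expression = [1, 2, 3, 4, 5, 6, 7, 8]`, `times = [10, 11, 12, 13, 1, 2, 3, 4]`, and all events observed, `logrank_test(times, [1] * 8, median_split(expression))` compares the four low-expression patients with the four high-expression patients.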

Supplemental Figure 6. LINC00162 expression correlation with ALDH1A3 expression in breast cancer patient tumors and cell lines.
(A) RNA-seq co-expression of LINC00162 (PICSAR) and ALDH1A3 in the TCGA Cell 2015 dataset was retrieved with cBioportal. (B) RNA-seq co-expression of LINC00162 and ALDH1A3 in the Cancer Cell Line Encyclopedia (breast cancer cell lines only) was retrieved using the CCLE portal (r = Pearson correlation).
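The r value reported for co-expression is a Pearson correlation coefficient. As a small illustration of what that number measures, the sketch below computes it from two paired expression vectors. The function name is hypothetical, and the portals compute the value themselves, so this is not their implementation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least two samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one vector is constant")
    return covariance / math.sqrt(ss_x * ss_y)
```

A value near 1 means the two transcripts rise and fall together across samples, and a value near -1 means they move in opposite directions.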

Supplemental Figure 7. GAPDH expression in cytoplasmic and nuclear compartments is nearly equal.
After cellular fractionation, RNA was isolated and cDNA synthesized. The GAPDH levels in the cytoplasmic and nuclear compartments of MDA-MB-468 cells were measured using QPCR. Transcript levels in each compartment are represented relative to the total level of GAPDH in the cell, which is set to 100% for each n (i.e. levels of GAPDH are nearly equal in both compartments, n = 3). Error bars represent standard deviation.

Figure 8. QPCR validation of a representative sampling of the microarray-identified NRAD1-regulated genes in MDA-MB-468, SUM149, and MCF7 cells.
Log2 fold change of transcript levels in cells treated with anti-NRAD1-specific GapmeR#3 or #4 versus control GapmeR in MDA-MB-468 (A), SUM149 (B), and MCF7 (C) cells. Expression is normalized to reference genes PUM1 and ARF1 and represented as fold change over GapmeR control-treated cells (n = 4). Error bars represent standard deviation (ND = not detected, i.e. expression levels below quantification threshold).
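The legend above reports log2 fold changes normalized to the reference genes PUM1 and ARF1. One common way to obtain such values is the delta-delta-Ct calculation, sketched below. The legend does not state which quantification model was used, so this method, the assumption of perfect amplification efficiency (a doubling per cycle), the function names, and the Ct values in the usage note are all illustrative assumptions.

```python
from statistics import mean


def delta_ct(sample, target, references=("PUM1", "ARF1")):
    """Target Ct minus the mean Ct of the reference genes.

    Averaging Ct values is equivalent to taking the geometric mean of the
    reference-gene quantities when efficiency is assumed to be 2.
    """
    return sample[target] - mean(sample[ref] for ref in references)


def log2_fold_change(treated, control, target, references=("PUM1", "ARF1")):
    """log2(treated / control) expression under the delta-delta-Ct model."""
    ddct = delta_ct(treated, target, references) - delta_ct(control, target, references)
    # A higher Ct means less transcript, so the sign is flipped.
    return -ddct
```

With hypothetical Ct values `control = {"NRAD1": 25.0, "PUM1": 22.0, "ARF1": 20.0}` and `treated = {"NRAD1": 27.0, "PUM1": 22.0, "ARF1": 20.0}`, the target appears two cycles later after treatment while the reference genes are unchanged, giving a log2 fold change of -2 (a four-fold reduction).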

Supplemental Tables
Supplemental Table 1
Table 3. NRAD1 is non-coding based on five metrics of protein-coding potential.
Using the online resource lncipedia.org, the coding potential of NRAD1 was assessed by five metrics. The PRIDE reprocessing score analyzes the predicted open reading frames (ORFs) of a sequence against over 100 human proteomics mass spectra; a score of 0 indicates no hits. The Lee translation initiation sites are mapped using lactimidomycin, an inhibitor of initiating ribosomes (compared to a no-treatment control); a score of 0 means no difference (i.e. no translation). The PhyloCSF algorithm generates probabilistic models of coding potential based on codon substitution frequencies; a sequence scoring under +60.00 is likely to be non-coding (the lower the score, the more likely the sequence is non-coding). CPAT assesses ORF length (a long putative ORF is unlikely to be observed by random chance in a non-coding sequence) and ORF coverage (the ratio of ORF length to transcript length). Bazzini Small ORFs uses ribosome profiling to test whether small ORFs in the sequence are translated; a score of 0 indicates no translation detected.
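To make the ORF length and ORF coverage features concrete, the sketch below scans the three forward reading frames of a transcript for the longest ATG-to-stop ORF and divides its length by the transcript length. This is an illustration of the two quantities only, not CPAT's implementation, which combines them with other features in a trained model. The function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(sequence):
    """Length in nucleotides of the longest complete ORF on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon, with the
    stop codon included. ORFs that lack a stop codon are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def orf_coverage(sequence):
    """Longest ORF length divided by transcript length (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return longest_orf_length(sequence) / len(sequence)
```

For the toy sequence `"CCATGAAATAGCC"`, the only complete ORF is `ATGAAATAG`, so the ORF length is 9 nucleotides and the coverage is 9/13.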