The oestrogen receptor alpha-regulated lncRNA NEAT1 is a critical modulator of prostate cancer

The androgen receptor (AR) plays a central role in establishing an oncogenic cascade that drives prostate cancer progression. Some prostate cancers escape androgen dependence and are often associated with an aggressive phenotype. The oestrogen receptor alpha (ERα) is expressed in prostate cancers, independent of AR status. However, the role of ERα remains elusive. Using a combination of chromatin immunoprecipitation (ChIP) and RNA-sequencing data, we identified an ERα-specific non-coding transcriptome signature. Among putatively ERα-regulated intergenic long non-coding RNAs (lncRNAs), we identified nuclear enriched abundant transcript 1 (NEAT1) as the most significantly overexpressed lncRNA in prostate cancer. Analysis of two large clinical cohorts also revealed that NEAT1 expression is associated with prostate cancer progression. Prostate cancer cells expressing high levels of NEAT1 were recalcitrant to androgen or AR antagonists. Finally, we provide evidence that NEAT1 drives oncogenic growth by altering the epigenetic landscape of target gene promoters to favour transcription.


Statistical analysis for quantitative RT-PCR:
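For quantitative real-time PCR, we subtracted the mean CT value of the control gene (HMBS) from the mean CT value of each gene g to compute the ΔCT value: $\Delta CT_g = \overline{CT}_g - \overline{CT}_{HMBS}$. The standard deviation of this value was calculated as the square root of the sum of the squares of the standard deviations: $SD_{\Delta CT_g} = \sqrt{SD_g^2 + SD_{HMBS}^2}$. We then computed $\Delta\Delta_g = 2^{-\Delta CT_g}$ for each gene and condition and considered each value to lie within the range $[\,2^{-(\Delta CT_g + SD_{\Delta CT_g})},\ 2^{-(\Delta CT_g - SD_{\Delta CT_g})}\,]$. Finally, we computed the fold change between conditions C1 and C2 (e.g. C1 = with E2 and C2 = without E2) as $FC_g = \Delta\Delta_g^{C1} / \Delta\Delta_g^{C2}$. The standard deviation of this value was computed according to the rules of propagation of error, i.e. $SD_{FC_g} = FC_g \sqrt{(\delta_1 / \Delta\Delta_g^{C1})^2 + (\delta_2 / \Delta\Delta_g^{C2})^2}$,
where $\delta_i$ is the range of $\Delta\Delta_g^{Ci}$ for condition i.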
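The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script. The function names and CT values are hypothetical. The "range" of each value is assumed to be the width of its interval, and the standard quotient form of error propagation is assumed.

```python
import math

def delta_ct(ct_gene, sd_gene, ct_ref, sd_ref):
    """Return (delta CT, SD of delta CT) for a gene relative to the reference gene."""
    d = ct_gene - ct_ref
    sd = math.sqrt(sd_gene ** 2 + sd_ref ** 2)
    return d, sd

def relative_expression(d, sd):
    """Return (2^-dCT, lower bound, upper bound) for one gene and condition."""
    return 2 ** (-d), 2 ** (-(d + sd)), 2 ** (-(d - sd))

def fold_change(expr1, range1, expr2, range2):
    """Ratio expr1 / expr2 with its SD (assumed quotient rule of error propagation)."""
    fc = expr1 / expr2
    sd = fc * math.sqrt((range1 / expr1) ** 2 + (range2 / expr2) ** 2)
    return fc, sd

# Hypothetical mean CT values and standard deviations (gene, then HMBS).
d1, s1 = delta_ct(24.0, 0.3, 20.0, 0.4)  # condition C1, e.g. with E2
d2, s2 = delta_ct(26.0, 0.3, 20.0, 0.4)  # condition C2, e.g. without E2

e1, lo1, hi1 = relative_expression(d1, s1)
e2, lo2, hi2 = relative_expression(d2, s2)

# Assumption: the range of each value is the width of its interval.
fc, fc_sd = fold_change(e1, hi1 - lo1, e2, hi2 - lo2)
print(f"fold change = {fc:.2f} +/- {fc_sd:.2f}")
```

With these made-up inputs the gene is detected two cycles earlier in C1 than in C2, which corresponds to a four-fold difference.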

Western Blot:
Cells were fractionated and lysed as described previously. Standard protocols were followed for western blotting 1. PVDF membranes (GE Healthcare) were used and were immunoblotted with specific antibodies.

RNA immunoprecipitation (RIP assay):
RIP assays were performed using the Millipore EZ-Magna RIP kit (17-701) according to the manufacturer's instructions. Briefly, cells were lysed in RIP lysis buffer, followed by immunoprecipitation with antibodies to Histone H3 (ab1791, Abcam, 5 µg) or SNRNP70 (CS203216, Millipore, 5 µg), or with the negative control Normal Rabbit IgG (PP64B, Millipore), using protein A/G magnetic beads. The bead-bound complexes were washed to remove unbound material, and the RNA was extracted and subsequently analyzed by qRT-PCR.

Peptide pull-down assay:
Nuclear lysates from VCaP and VCaP ERα cells treated with either vehicle or E2 were used in a streptavidin-biotin pull-down assay using biotinylated histone peptides 2. RNA bound to streptavidin beads was recovered using Trizol and reverse transcribed to obtain cDNA. Levels of immunoprecipitated NEAT1 were determined by quantitative PCR.

RNA sequencing:
Standard poly-A-selected RNA sequencing was performed for VCaP cells, VCaP ERα-expressing cells, VCaP cells expressing the empty vector and VCaP cells overexpressing NEAT1, using the Illumina TruSeq RNA-seq protocol.
Reads were aligned to the reference genome NCBI36/hg18, excluding the minor haplotypes and the minor sequences, using the STAR aligner 3.

ChIP seq data analysis:
Peak detection for all ChIP-seq experiments was performed with ChIPseeqer 5, using the same parameters for all datasets (i.e., p-value threshold for peaks = $10^{-5}$, minimum distance between peaks = 100 bp). Genomic annotation of ChIP-seq peaks, comparison between ChIP-seq datasets and motif analysis were performed using the corresponding tools in the ChIPseeqer software.
We ranked the binding sites according to their p-value as determined by ChIPseeqer and considered the expression levels of the potential target genes. We defined a gene as a target if it lay within 20 kb of a peak, and considered only genes whose expression was higher than 1 in at least one condition (either VCaP control or VCaP ERα).
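The target-gene rule can be illustrated with a short Python sketch. It is not ChIPseeqer code. The function name, tuple layouts and example coordinates are hypothetical. Distance is assumed to be measured from the peak to the nearest edge of the gene body, which the text does not specify.

```python
def assign_targets(peaks, genes, expression, max_dist=20_000, min_expr=1.0):
    """Pair each peak with expressed genes lying within max_dist of it.

    peaks:      list of (chrom, start, end, pvalue)
    genes:      list of (name, chrom, start, end)
    expression: dict mapping gene name -> {condition: expression value}
    Peaks are visited in order of increasing p-value (most significant first).
    """
    targets = []
    for chrom, p_start, p_end, pval in sorted(peaks, key=lambda p: p[3]):
        for name, g_chrom, g_start, g_end in genes:
            if g_chrom != chrom:
                continue
            # Gap between the two intervals; 0 when they overlap.
            gap = max(g_start - p_end, p_start - g_end, 0)
            # Expressed above the threshold in at least one condition.
            values = expression.get(name, {}).values()
            expressed = max(values, default=0.0) > min_expr
            if gap <= max_dist and expressed:
                targets.append((name, chrom, p_start, p_end, pval))
    return targets

# Hypothetical peaks, genes and expression values.
peaks = [("chr1", 100_000, 100_500, 1e-8), ("chr1", 500_000, 500_400, 1e-12)]
genes = [
    ("GENE_A", "chr1", 110_000, 120_000),  # 9.5 kb from the first peak
    ("GENE_B", "chr1", 530_000, 540_000),  # 29.6 kb from the second peak
    ("GENE_C", "chr1", 95_000, 99_000),    # close to the first peak, not expressed
]
expression = {
    "GENE_A": {"control": 0.5, "ERa": 3.2},
    "GENE_B": {"control": 5.0, "ERa": 6.0},
    "GENE_C": {"control": 0.2, "ERa": 0.4},
}
targets = assign_targets(peaks, genes, expression)
print(targets)
```

Only GENE_A passes both filters here: GENE_B is too far from its nearest peak and GENE_C is below the expression threshold in both conditions.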

Association of NEAT1 correlated signature with Oncomine concepts:
Similar to the ERα signature, we created a NEAT1 gene signature from the genes that were positively correlated (correlation > 0.5) with NEAT1 expression across prostate cancer datasets in Oncomine. Looking for genes shared with the 588-gene ERα signature, we identified 155 overlapping genes. We defined this NEAT1-ERα signature as an Oncomine custom concept and determined the significantly associated tumor vs. normal concepts (odds ratio > 3.0 and $P < 1 \times 10^{-6}$).
These results are represented as a network using Cytoscape version 2.8.2.

Dataset of long non-coding RNAs (lncRNAs):
We generated a reference set of known ncRNAs from various sources, listed hereafter: -

Characterization of the lncRNAs:
We characterized the lncRNAs according to their potential of being regulated by ERα.
Moreover, we considered several histone marks to provide evidence of transcription.

Estrogen receptor alpha (ERα) binding:
ChIP-sequencing data from two prostate cell lines (VCaP ERα and NCI-H660) were used to identify the binding sites of the estrogen receptor. The data was analyzed by […]. We also considered other transcription factors that may act as cofactors in estrogen regulation. Specifically, we looked at Activator Protein 1 (AP-1), a regulator of gene expression in response to a variety of stimuli. It is a heterodimer composed of several proteins, among which are c-Jun, c-Fos and JunD. We therefore took the binding sites of these proteins (from the GM12878 Yale TFBS track in the UCSC Genome Browser) and identified those that are fully contained in ERα binding sites.
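The containment test for cofactor binding sites can be sketched as follows. This is a minimal illustration with hypothetical coordinates and a hypothetical function name. It uses a simple all-against-all scan per chromosome, which is adequate for small inputs; it is not the tool used in the study.

```python
def fully_contained(inner, outer):
    """Return the inner intervals that lie entirely within some outer interval.

    Both arguments are lists of (chrom, start, end) tuples.
    """
    by_chrom = {}
    for chrom, start, end in outer:
        by_chrom.setdefault(chrom, []).append((start, end))

    hits = []
    for chrom, start, end in inner:
        candidates = by_chrom.get(chrom, [])
        # Contained means: starts no earlier and ends no later than the outer site.
        if any(o_start <= start and end <= o_end for o_start, o_end in candidates):
            hits.append((chrom, start, end))
    return hits

# Hypothetical ERα binding sites and AP-1 component binding sites.
era_sites = [("chr1", 1000, 2000), ("chr2", 500, 900)]
ap1_sites = [
    ("chr1", 1200, 1400),  # inside the first ERα site
    ("chr1", 1900, 2100),  # overlaps but extends past the ERα site
    ("chr2", 100, 200),    # no overlap
]
contained = fully_contained(ap1_sites, era_sites)
print(contained)
```

Partial overlaps are deliberately excluded, matching the "fully contained" wording.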

Histone marks:
Similar to the ERα analysis, we characterized the lncRNAs with respect to active marks (H3K4me3 and H3K36me3) and repressive marks (H3K9me3 and H3K27me3), provided by Chinnaiyan et al. 14. A window of 10 kb was used to associate a histone mark with an lncRNA.

RNA extraction, sample preparation and sequencing:
For RNA sequencing analysis, frozen tissue was cored (1.5 mm biopsy cores) and RNA was extracted using TRIzol Reagent (Invitrogen, CA […] preparation. Illumina's sample preparation protocol for paired-end sequencing of mRNA was used, as previously described 15. The paired-end reads were then aligned to the human genome (hg18) using ELAND/CASAVA, as described previously 15,16.

Differential expression analysis:
We performed pair-wise differential expression analysis on a set of paired-end RNAseq samples (26 benign prostate, 40 prostate adenocarcinoma, and 7 neuroendocrine prostate adenocarcinoma) in order to prioritize the experimental validation.
Supplementary dataset 1 shows the details of the sequencing. Briefly, using the Illumina software suite ELAND/CASAVA, reads were simultaneously mapped to the reference genome NCBI36/hg18 (without the minor haplotypes and the minor sequences) and to a splice junction library based on the UCSC knownGenes annotation dataset. Reads mapped to the mitochondrial genome were removed. The library was generated using RSEQtools according to the read size 4. Gene expression (RPKM) was computed on the composite models based on the UCSC knownGenes annotation dataset. Raw RPKM values were then log2-transformed after adding 1, and the resulting dataset was quantile normalized using the R software package "limma" (http://bioinf.wehi.edu.au/limma).
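The transformation and normalization step can be illustrated in plain Python. This is a simplified sketch, not limma's implementation. The function names and RPKM values are hypothetical, and ties are broken by input order, whereas limma averages tied ranks.

```python
import math

def log2_plus_one(matrix):
    """Apply log2(x + 1) to every value of a genes x samples matrix."""
    return [[math.log2(v + 1.0) for v in row] for row in matrix]

def quantile_normalize(matrix):
    """Give every sample (column) the same distribution of values.

    The value of rank k in each sample is replaced by the mean, across
    samples, of the values of rank k.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[k] for col in sorted_cols) / n_samples for k in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Indices of genes ordered by their value in this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result

# Hypothetical RPKM values: three genes (rows) in two samples (columns).
rpkm = [[0.0, 3.0], [1.0, 7.0], [3.0, 15.0]]
normalized = quantile_normalize(log2_plus_one(rpkm))
print(normalized)
```

After normalization both samples share the same set of values, so differences between samples reflect gene ranks rather than overall scale.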
We performed a Wilcoxon test for benign vs. PCa and for PCa vs. NEPC on the normalized dataset. The candidates in each list were ranked by p-value after correction for multiple hypothesis testing 17. Similarly, we computed the RPKM values for the known lncRNAs and performed the same differential analysis on log2-transformed RPKM+1 values.
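Ranking candidates by corrected p-value can be sketched as below. The correction method is not named in the text, so the Benjamini-Hochberg procedure is used here purely as an assumption. The gene names and p-values are hypothetical, and the per-gene Wilcoxon p-values are assumed to have been computed already.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a Wilcoxon test.
names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
pvals = [0.01, 0.04, 0.03, 0.5]

adjusted = benjamini_hochberg(pvals)
# Rank candidates by adjusted p-value, most significant first.
ranked = sorted(zip(names, adjusted), key=lambda pair: pair[1])
for name, q in ranked:
    print(f"{name}\t{q:.4f}")
```

Adjusted values for neighbouring ranks can coincide, as for GENE_B and GENE_C in this example, so ties in the ranked list are expected.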

VCaP and VCaP ERα sequencing:
Strand-specific sequencing was performed for VCaP and VCaP ERα-expressing cell lines using the deoxy-UTP method for library preparation 18, with some modifications. Briefly, the Ribo-Zero rRNA removal kit (RZH1046, Epicentre Biotechnologies, Madison, WI) was used to remove ribosomal RNA, followed by incorporation of deoxy-UTP during second-strand cDNA synthesis. Subsequently, the uridine-containing strand was destroyed, which enabled identification of the transcript orientation.
Reads from each strand were aligned to the reference genome NCBI36/hg18 without the minor haplotypes and the minor sequences. Reads mapped to the mitochondrial genome were removed. The expression of each gene (UCSC knownGenes) and of the lncRNAs identified by our method was computed using RSEQtools 4. Specifically, we used bgrQuantifier after generating the bedgraph files from the mapped data. The reference annotation file (UCSC knownGene annotation or the known lncRNA dataset), which includes the coordinates of all genes from both strands, was first split according to strand information where available. This avoided assigning reads to the wrong gene on the opposite strand when computing the expression levels. The resulting expression data were used to identify variation in gene expression between the control and ERα-expressing cells. We computed the ratio between VCaP and VCaP ERα after adding 1 to each value, and selected those genes with a log2 fold change greater than 2. Results are reported in Supplementary dataset 3.
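The fold-change filter can be sketched in Python as follows. The function names and expression values are hypothetical. The direction of the ratio (ERα-expressing over control) and the use of a one-sided threshold are assumptions, since the text does not state them.

```python
import math

def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of treated over control after adding a pseudocount to both."""
    return math.log2((treated + pseudocount) / (control + pseudocount))

def select_changed(expression, threshold=2.0):
    """Keep genes whose log2 fold change exceeds the threshold.

    expression: dict mapping gene name -> (control value, treated value)
    """
    selected = {}
    for gene, (control, treated) in expression.items():
        lfc = log2_fold_change(control, treated)
        if lfc > threshold:
            selected[gene] = lfc
    return selected

# Hypothetical expression values: (VCaP control, VCaP ERα).
expression = {
    "GENE_A": (1.0, 15.0),  # log2(16 / 2) = 3
    "GENE_B": (3.0, 7.0),   # log2(8 / 4) = 1
    "GENE_C": (0.0, 2.0),   # log2(3 / 1), about 1.58
}
selected = select_changed(expression)
print(selected)
```

Adding 1 before taking the ratio keeps the fold change finite for genes with zero expression in one condition and damps large ratios between very small values.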

Preparation of NEAT1 RNA ISH probe:
The full-length NEAT1 gene was cloned into the pCRII-TOPO vector (Invitrogen). The plasmid was digested with NotI (New England Biolabs) and purified with the Qiaquick PCR Purification Kit (Qiagen) according to the manufacturer's protocol. In vitro transcription was performed using 500 ng of linearized plasmid DNA and the MEGAscript SP6 kit (Ambion) as directed by the manufacturer. The resulting RNA was cleaned using the RNeasy kit (Qiagen) according to the manufacturer's protocol. 5 µg of RNA was mixed with 10 µl of buffer A, 5 µl of ML DNP reagent (both provided by Ventana-Roche) and water to 50 µl. The reaction was incubated at 37°C for two hours. The labeled RNA was cleaned with the RNeasy Kit (Qiagen). 50-250 ng/ml of probe was mixed in Ribohybe solution (Ventana-Roche) and 100 µl of the probe was used for each slide. RNA in situ hybridization on FFPE slides was performed using an automated protocol developed for the Discovery XT automated staining system (Ventana-Roche).