The long noncoding RNA H19 regulates tumor plasticity in neuroendocrine prostate cancer

Neuroendocrine (NE) prostate cancer (NEPC) is a lethal subtype of castration-resistant prostate cancer (PCa) arising either de novo or from transdifferentiated prostate adenocarcinoma following androgen deprivation therapy (ADT). Extensive computational analysis has identified a high degree of association between the long noncoding RNA (lncRNA) H19 and NEPC, with the longest isoform highly expressed in NEPC. H19 regulates PCa lineage plasticity by driving a bidirectional cell identity of NE phenotype (H19 overexpression) or luminal phenotype (H19 knockdown). It contributes to treatment resistance, with the knockdown of H19 re-sensitizing PCa to ADT. It is also essential for the proliferation and invasion of NEPC. H19 levels are negatively regulated by androgen signaling via androgen receptor (AR). When androgen is absent SOX2 levels increase, driving H19 transcription and facilitating transdifferentiation. H19 facilitates the PRC2 complex in regulating methylation changes at H3K27me3/H3K4me3 histone sites of AR-driven and NEPC-related genes. Additionally, this lncRNA induces alterations in genome-wide DNA methylation on CpG sites, further regulating genes associated with the NEPC phenotype. Our clinical data identify H19 as a candidate diagnostic marker and predictive marker of NEPC with elevated H19 levels associated with an increased probability of biochemical recurrence and metastatic disease in patients receiving ADT. Here we report H19 as an early upstream regulator of cell fate, plasticity, and treatment resistance in NEPC that can reverse/transform cells to a treatable form of PCa once therapeutically deactivated.

https://doi.org/10.1038/s41467-021-26901-9

A full list of author affiliations appears at the end of the paper.
Neuroendocrine prostate cancer (NEPC) is a highly aggressive and lethal subtype of prostate cancer (PCa) capable of widely metastasizing to organs and bone 1 . Patients with NEPC have limited therapeutic options, and the median overall survival is ~7 months to ~4 years from the time of diagnosis 2,3 . While the disease can develop de novo (dNEPC) 4 , it occurs primarily after treatment (tNEPC), arising by a complex process of neuroendocrine transdifferentiation (NEtD) of prostate adenocarcinoma (AdPC). This cellular transformation results from selective pressures from potent androgen receptor (AR) pathway inhibition in castration-resistant prostate cancer (CRPC) [5][6][7][8] . With the introduction of highly potent AR-targeting agents, the incidence of tNEPC is increasing 3,9,10 . Manifestations of this subtype include low levels of prostate-specific antigen (PSA) secretion, indifference to AR pathway inhibition, reduced AR protein expression, and the presence of lytic bone lesions and visceral metastasis [11][12][13] . NEPC can present with either a small or large cell phenotype and is identified histologically by its tumor morphology and expression of neuroendocrine (NE) markers, chromogranin A (CHGA), synaptophysin (SYP), and neuron-specific enolase (NSE) 11,14 .
H19 is predominantly active during fetal development 30 , with the highest expression in developing skeletal and smooth muscles. It is encoded by the H19/IGF2 imprinted gene cluster located on human chromosome 11p15.5. Defined as an oncofetal gene 31 , H19 becomes downregulated during tissue maturation but can be re-expressed in cancer 32 . H19 has both oncogenic and tumor suppressor functions in multiple cancer types 33 . Within these tumors, it has a regulatory role in a range of biological processes associated with tumor growth, including genome destabilization, hypoxia, epithelial-to-mesenchymal transition (EMT), and mesenchymal-to-epithelial transition (MET) 33,34 .
In the present study, we provide insights into the role of H19 in NEPC while confirming its significant abundance in multiple NEPC clinical cohorts. Experiments demonstrate that H19 functions as a driver of lineage plasticity in PCa and can induce an NEPC-like phenotype. Knocking down H19 reverses this process and induces a luminal-like phenotype with increased sensitivity to androgen deprivation therapy (ADT). We further demonstrate that H19 functions as an epigenetic regulator in NEPC by binding to the PRC2 complex, modulating H3K27me3 and H3K4me3 histone marks. H19 also prompts alterations in genome-wide DNA methylation on CpG sites. Collectively, this remodels chromatin near AR signaling (ARS) and NE genes, regulates the methylation of ARS and NE gene promoters, and consequently regulates ARS and NE expression. Therefore, H19 plays an essential epigenetic role in NEPC. Our clinical data reveal that H19 can be used as a diagnostic biomarker for NEPC and is a predictive marker for biochemical recurrence and metastasis in the context of ADT.
Expression of H19 was highest in NEPC and was significantly increased in all cohorts compared to control samples (Fig. 1A, B, Supplementary Data 2, p < 0.05), except the dNEPC (BCCA) cohort. It is unclear if this resulted from the high tumor cellularity in matched benign samples or disease pathology. In some cohorts (WCM1 and WCDT), H19 was the topmost differentially expressed lncRNA of all annotated noncoding RNAs across the entire transcriptome. Increasing expression levels of H19 were associated with increasing Gleason grade and treatment status ( Fig. 1A-VPC). H19 expression was lowest in low-risk (Gleason<7), higher in Gleason>7 (p < 0.01) PCa, further increased in neoadjuvant hormone-treated (NHT) patients (p = 0.0886), and most highly expressed in NEPC (p-value < 0.01).
To support the involvement of H19 in NEPC, we calculated pairwise Pearson correlations of H19 with NEPC markers in the WCM1 cohort (Fig. 1C, Supplementary Fig. 4A). Positive correlations were observed between H19 and classical NEPC markers CHGA, CHGB, and SYP, as well as the NEPC oncogene MYCN (R = 0.59, p = 0.00012; R = 0.78, p = 1.5e−08; R = 0.70, p = 1.2e−06; R = 0.72, p = 5.2e−07, respectively). Negative correlations were observed between H19 and AR and SPDEF (Fig. 1C, Supplementary Fig. 4A), both deactivated in NEPC (R = −0.65, p = 1.2e−05; R = −0.52, p = 0.00087, respectively). Because NEPC is rare and each cohort is small, we amalgamated all our sequenced clinical NEPC cohorts (see Methods for analysis and normalization). Unsupervised hierarchical clustering of H19 along with 38 known NEPC genes/lncRNAs within this merged cohort showed distinct separation of AdPC and NEPC samples with no clear cohort bias (Fig. 1E). We next searched for correlations with previously identified TFs and oncogenes of NEPC to identify putative H19 regulators/targets. Interestingly, we found that SOX2 and EZH2 were significantly and positively correlated with H19 (Fig. 1C; R = 0.61, p = 7.2e−05; R = 0.67, p = 4.7e−06). To investigate how H19 associates with NE activation and AR deactivation, we compared its expression to available NEPC and AR signature scores in WCM1. We observed a strong positive correlation to NEPC and a negative correlation to AR signature scores (Fig. 1D; R = 0.75, p = 7.2e−08; R = −0.73, p = 2.8e−07). To investigate how this result would change across CRPC phenotypes, we repeated our correlation analyses to include WCM2 samples (WCM1 + WCM2) and found highly significant correlations to individual genes and gene signatures identified previously (Supplementary Fig. 4B-D).
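The pairwise correlation analysis described above can be illustrated with a minimal sketch. It computes the Pearson coefficient R and the t statistic commonly used to test R = 0 (with n − 2 degrees of freedom). The function names and the expression values are hypothetical and chosen only for illustration; they are not study data, and this is not presented as the authors' analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r = 0; compare against a t distribution with n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical log2 expression values for six samples (illustration only).
h19 = [2.1, 3.4, 5.0, 6.2, 7.9, 8.3]
chga = [1.0, 2.2, 3.9, 4.1, 6.5, 7.0]   # rises with H19
ar = [9.0, 8.1, 6.3, 5.2, 3.0, 2.4]     # falls as H19 rises

r_pos = pearson_r(h19, chga)
r_neg = pearson_r(h19, ar)
```

In this toy input the marker that rises with H19 gives a positive R and the one that falls gives a negative R, mirroring the direction (not the magnitude) of the reported correlations. Converting the t statistic to a p-value requires a t-distribution CDF, which is left to a statistics library.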
Large-scale testing is required to establish if a dysregulated gene is clinically recurrent (i.e., present in a larger population pool). With the infrequency of NEPC and the scarcity of biospecimens, testing this observation is challenging. However, a recent analysis of a cohort of 26,245 prospective PCa samples (Decipher GRID), using an NEPC fingerprint of 212 genes (small cell genomic signature-SCGS), identified a subset of patients with transcriptomes analogous to NEPC 35 . With none of these patients having received treatment, we hypothesized these cases were representative of dNEPC or AdPC at high risk for lineage plasticity when exposed to ADT. Therefore, using the SCGS signature, we isolated patients within the top and bottom 1% of scores and classified them into NEPC (n = 263) and AdPC phenotypes, respectively. After affirming these samples were molecularly concordant to NEPC (Supplementary Fig. 3), we observed that H19 had the highest differential expression in NEPC vs. AdPC among these genes (Fig. 1B-GRID). Taken together, these data support the observation that lncRNA H19 is associated with increasing Gleason grade, treatment status, and the NEPC phenotype.
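The selection of samples at the extremes of a signature score, as described above, can be sketched as follows. The function name, labels, and the round-up rule for the group size are assumptions made for illustration; the source states only that the top and bottom 1% of SCGS scores were used.

```python
def classify_by_extreme_scores(scores, percent=1):
    """Label the top `percent` % of samples 'NEPC-like' and the bottom `percent` % 'AdPC-like'.

    `scores` maps sample id -> signature score. Samples between the two tails
    receive no label. The group size is rounded up (an assumption).
    """
    if not 0 < percent <= 50:
        raise ValueError("percent must be in (0, 50]")
    ranked = sorted(scores, key=scores.get)        # ascending by score
    n = len(ranked)
    k = (n * percent + 99) // 100                  # integer ceiling of n * percent / 100
    if 2 * k > n:
        raise ValueError("too few samples for non-overlapping tails")
    labels = {}
    for sample in ranked[:k]:
        labels[sample] = "AdPC-like"
    for sample in ranked[-k:]:
        labels[sample] = "NEPC-like"
    return labels


# Hypothetical scores for 200 samples (illustration only): score equals the index.
toy_scores = {f"s{i}": float(i) for i in range(200)}
toy_labels = classify_by_extreme_scores(toy_scores, percent=1)
```

With 200 toy samples and a 1% cutoff, two samples fall in each tail. How ties at the cutoff are resolved is a design choice not specified in the text; this sketch relies on sort order.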
H19 is functionally conserved, and the longest isoform is predominant in NEPC. To support the potential biological importance of H19, we tested its level of conservation across eutherian species. Of the 70 species examined, 47 had DNA alignments and synteny blocks for the human H19 locus. A multiple sequence alignment (MSA) of these sequences showed a high degree of alignment for the region encompassing the five core exons of H19 (Fig. 2A-right side of MSA, Supplementary Fig. 5A). Previous studies have suggested that the secondary structure of lncRNAs is conserved, signifying their functional biological role 36 . We modeled H19 secondary structure and observed a stable minimum free energy (MFE) structure (Supplementary Fig. 6A-D) conserved across our test species. We analyzed a ~1000 nt region spanning the five core exons of H19 from our MSA and MFE structure (Supplementary Fig. 7) and mostly observed a low covariance (−1 to −2) in the structure across this segment; a smaller segment (250 nt) showed a similarly low rate of covariance (Fig. 2B). These data indicate that H19 is not only conserved in its primary sequence but, despite minor sequence differences (Fig. 2B-non-green colored blocks), is also conserved in its secondary structure. Investigating the human form of H19 more deeply revealed numerous cancer-related SNPs, with ~55% (12/22) [...]
H19 has a diverse range of reported functions 33 that could occur due to alternative splicing and usage of specific isoforms in different cellular contexts. H19 has 13 annotated isoforms (H19i to H19xiii; Fig. 2C). With increased coverage and depth in our VPC cohort, we quantified all isoforms and plotted each in decreasing order based on log2 mean (m) expression (Fig. 2D, Supplementary Fig. 5B). This metric and ranking supported the longest isoform (H19i) as the dominant isoform in tNEPC with >4-fold mean expression across all samples. This result was validated when looking across our other NEPC samples, matching the order from Fig. 2D (Fig. 2E, Supplementary Fig. 5C). Other isoforms of H19 could be relevant in NEPC, yet their relatively low expression likely results in a nonfunctional role.
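The isoform ranking by log2 mean expression described above can be sketched in a few lines. The pseudocount (added so that an isoform with zero mean expression does not produce log2 of zero), the function name, and the expression values are assumptions for illustration and are not taken from the study.

```python
import math


def rank_isoforms(expression, pseudocount=1.0):
    """Return (isoform, log2 mean expression) pairs sorted from highest to lowest.

    `expression` maps isoform name -> list of per-sample expression values.
    The pseudocount is an assumption; the source does not state one.
    """
    ranked = []
    for isoform, values in expression.items():
        mean = sum(values) / len(values)
        ranked.append((isoform, math.log2(mean + pseudocount)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical per-sample expression values (illustration only).
toy_expression = {
    "H19ii": [20.0, 31.0, 18.0],
    "H19i": [120.0, 95.0, 143.0],
    "H19iii": [3.0, 0.0, 6.0],
}
isoform_ranking = rank_isoforms(toy_expression)
```

The resulting order can then be used to number isoforms by expression rank, in the spirit of the H19i to H19xiii naming used in the text.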
H19 is elevated in pre-clinical models of NEPC. To experimentally correlate the level of H19 with the NEPC phenotype, we evaluated the expression of this lncRNA in NEPC patient-derived organoids OWCM-154, -155, -1078, and -1262 37 . H19 expression was markedly elevated in 3 of the 4 NEPC organoids compared to organoids derived from normal prostate or PCa patients who underwent primary tumor resection for AdPC (Fig. 3A). These samples also exhibited increases in stem cell and NE markers in NEPC vs. non-NEPC samples (Supplementary Fig. 8A). In addition, the NEPC cell lines, NCI-H660 and LASCPC-01, compared to AdPC cell lines C4-2B, VCaP, and LNCaP, express elevated H19, stem cell genes, and NE markers (Fig. 3B [...] Fig. 8D). The methylation percentage was quantified by coupled abscription PCR signaling (CAPS) (Supplementary Methods, Supplementary Fig. 6D). Results indicate that the AdPC cells (LNCaP, C4-2B, and V16D) have a higher percent of ICR1 methylation when compared to the NEPC cell line (NCI-H660) and NEPC organoid (OWCM-1262), both of which demonstrate methylation in the normal imprinting range of 40-60% (Fig. 3C). Similar results were observed within the ICR1 locus of our LTL331 NEPC xenograft model (Supplementary Fig. 11B). These results suggest that the elevated level of H19 seen in NEPC could be secondary to changes in methylation of the imprinting center (ICR1) compared to AdPC.
Table legend: All cohort samples and their associated pathology, Gleason grading, clinical features, and outcomes, including matched benign (BE), adenocarcinoma (AD), neoadjuvant hormonally treated adenocarcinoma (AD NHT), castration-resistant prostate cancer (CRPC), mixed pathology including adenocarcinoma and small cell (MX-P), neuroendocrine prostate cancer (NEPC), de novo NEPC (dNEPC), treatment-induced (tNEPC), Gleason [...]
Fig. 1 A [...] In the VPC cohort (n = 75), rising levels of H19 expression are seen across increasing Gleason grades (Gleason grading ≦ 6 = AD Low and Gleason grading ≧ 8 = AD High), including NHT-treated samples, and peak in NEPC. Significant upregulation of H19 is observed in mixed Gleason grading (MX-G) adenocarcinoma vs. NEPC in the WCM1 cohort (n = 37) and CRPC vs. NEPC samples of WCM2 (n = 49) and WCDT (n = 45) cohorts. Non-significant (ns) yet elevated expression of H19 is observed in benign (BE) vs. dNEPC of the BCCA cohort (n = 15), possibly due to high tumor cellularity in matched BE samples. B Microarray NEPC clinical cohorts. Similarly, in the JHMI (n = 33) and GRID (n = 526) cohorts, rising levels of H19 from AD High/AD MX-G to mixed AD and small cell pathology (MX-P) to NEPC are observed. Box and Whisker plots display lower quartile, upper quartile, and median bounds of cohort expression at the box's minima, maxima, and centerlines, respectively. Whisker lines display lower (bottom) and upper (top) extreme value ranges. Single data points represent outliers in a cohort. p Values were calculated by an unpaired two-sided Student's t test. Significance was represented by *p < 0.05; **p < 0.01; ***p < 0.001 and ****p < 0.0001 unless specifically noted. C In WCM1, H19 shows a significant positive correlation with CHGA/B, SYP, SOX2, and EZH2 and shows a significant negative correlation with AR expression. D Again in WCM1, H19 shows a significant positive and negative correlation to known NEPC and AR gene signatures, respectively. Correlation coefficients (R) and p values (p) were calculated using a Pearson correlation statistical test. The shaded area represents confidence intervals at 95%.
Approximately 70% of NEPC patients harbor mutations or deletions in TP53 and RB1 27 . In mouse models, Trp53 mutation cooperates with Rb1 loss to induce Ar^low, Syp^high NE-like tumors resistant to orchiectomy-induced androgen deprivation 16 . To evaluate whether the Trp53/Rb1 knockout modulated H19 expression, organoids derived from Trp53^flox/flox / Rb1^flox/flox mouse prostates were transduced with lentivirus expressing Cre recombinase (Cre-GFP), generating double-knockout (DKO) organoids (Supplementary Fig. 8E). Organoids transduced with the lentivirus EV-GFP were used as control. These DKO organoids demonstrated enhanced expression of H19, markers of the NE phenotype (Chga, Nse, Syp, Brn2, and Ascl1), and stem cell genes (Klf4, Oct4, and Sox2) (Fig. 3D) 16 . In comparison, [...] Fig. 8F).
Fig. 2 [...] Supplementary Fig. 5. The highlighted box (red) spans a ~250 nt region where we narrow/zoom into for a B detailed MSA (horizontal bars) and integrate RNA secondary structure (purple arcs). Each bar/row represents a test species and is in the order shown from the phylogenetic tree in (A and Supplementary Fig. 5). Each arc represents a nucleotide base pairing and binding event. Most of this region is highly conserved (green), yet despite areas with sequence differences (blue, light blue, orange, or gray), these changes do not affect H19's secondary structure, which is stable and contains little covariation (denoted by arc colour). C Schematic of human chromosome 11 (top), H19's locus and neighboring genes (middle), and Ensembl's 13 annotated isoforms and exon-exon structures (bottom)-H19i through H19xiii. The order and number of isoforms were assigned based on expression ranking observed in panels (D) and (E) [...]
Organoids derived from mice with other genetic mutations commonly found associated with NEPC, including Trp53/Pten DKO and Rb1/Pten DKO, also demonstrated induction of H19 (Supplementary Fig. 8G, H). Similar results could be seen using a model system in which NEtD is induced by incubating the hormone-dependent cell lines, LAPC-4 and LNCaP, in a stem transition medium that differentiates these cells into an NEPC-like stem-like state 10,39 (Supplementary Fig. 8I). While induction of NEtD significantly increased H19 in these cells, levels returned to baseline when the cells were placed back in normal serum. Thus, H19 is elevated in NEPC in vitro models and is modulated by critical drivers of NEtD.
To examine whether stem cell genes might regulate H19 levels in NEPC, SOX2, NANOG, and OCT4 were individually knocked down in LNCaP cells with previously deleted TP53/RB1. Results demonstrated that decreasing each stem cell gene caused a significant reduction in H19 expression ( Supplementary Fig. 9G). Furthermore, as demonstrated previously 40,41 , the knockdown of each stem cell gene caused changes in mRNA level of the other two genes investigated ( Supplementary Fig. 9G). This finding suggests a possible feed-forward mechanism that controls the levels of this lncRNA within the cell.
H19 knockdown reduces cell proliferation and invasion and resensitizes resistant cells to enzalutamide. NEPC is characterized by highly proliferative cells with increased metastatic potential 42 . Stable H19 knockdown inhibited OWCM-155 proliferation in vitro (Fig. 4A). Similarly, Trp53/Rb1 DKO murine organoids transduced with shH19 showed markedly suppressed growth (p = 7 × 10^−9) (Fig. 4B, C). Similar growth suppression was seen with the knockdown of H19 in LNCaP-SL and 42F ENZR cells (Supplementary Fig. 10A, B). Furthermore, organoids with H19 knockdown (OWCM-155-shH19) injected subcutaneously into mice grew significantly more slowly, with reduced tumor weight and volume, than control (OWCM-155-shSCR) organoids (Fig. 4D, E, Supplementary Fig. 10C). In addition, qPCR validated that tumors with H19 knockdown had reduced H19 and NE marker expression (Fig. 4F). These results indicate that H19 is critical for both NEPC growth and differentiation. To examine the ability of H19 to modulate tumor cell invasion, dissociated mouse Trp53/Rb1 DKO organoid cells were placed in a transwell with FluoroBlok inserts (Fig. 4G), and H19 knockdown was shown to inhibit invasion. Similar results were seen in LNCaP-SL cells (Supplementary Fig. 10D). Together these data confirm that a decrease in H19 affects the growth and invasive potential of NEPC.
Knockdown of TP53/RB1 in LNCaP cells allows them to become NEPC-like, showing less sensitivity to growth inhibition by the AR antagonist enzalutamide (ENZA) 16 . LNCaP cells containing shTP53/Rb1 are resistant to ENZA-induced growth blockade and instead undergo NEtD. Interestingly, when these cells are transduced with shH19, they regained their sensitivity to hormone blockade (Fig. 4H) and demonstrate growth inhibition by ENZA. Western blots demonstrated that H19 knockdown (shH19-C and shH19-D, Supplementary Fig. 9H) abrogated this ENZA induced NEtD, reducing the protein levels of NE markers (Fig. 4I). Conversely, ENZA (2 μM or 5 μM) treatment for 5 days inhibited the growth of control LNCaP cells, but not those with stable overexpression of H19 (Fig. 4J, K). However, we did observe reduced AR levels with ENZA treatment in H19 overexpression LNCaP cells, indicating the complexity of these molecular pathways. Together these data highlight the importance of H19 in regulating the sensitivity of PCa cells to ADT.
Androgen deprivation and stem cell genes regulate H19 transcription during NEtD. NEtD is a complex process involving suppressed androgen signaling followed by a stem cell state that allows the cells to dedifferentiate into an NE phenotype. Using existing RNA-sequencing and microarray expression data from the LTL331/LTL331R PDX models of NEtD 19 , we sought to identify when H19 is expressed. Analysis of RNA collected during different phases of NEtD (AdPC, post-castration, and NEPC) (Supplementary Fig. 11A) revealed that H19 transcription is elevated in a biphasic manner, the first increase occurring post-castration and a second during the terminal differentiation to NEPC (Fig. 5A). In this model, the induction of SOX2 occurs primarily in the second phase of NEtD (Supplementary Fig. 11E).
Since H19 levels increased post-castration, further analysis of the relationship between H19 and the AR was undertaken. Our computational analysis revealed an inverse correlation between H19 and AR expression (Supplementary Fig. 12A). Consistent with this observation, knockdown or overexpression of H19 in NCI-H660, LASCPC-01, and V16D CRPC cells showed a strong inverse correlation between PSA (RNA) and H19 (Supplementary Fig. 12C). To test the direct effects of AR signaling, C4-2B cells were treated with the AR agonist dihydrotestosterone (DHT) or the antagonist ENZA. The addition of DHT suppressed H19 expression (Fig. 5B), while ENZA increased H19 levels (Fig. 5C). Moreover, long-term androgen-deprived LNCaP cells that became neuronal-like and expressed increased NE markers demonstrated elevated H19 levels (Fig. 5D). To study the mechanism by which AR regulated H19, chromatin immunoprecipitation (ChIP) was performed with an anti-AR antibody on C4-2B cells treated with DHT or vehicle. ChIP-qPCR of the H19 upstream region demonstrated three ARE binding sites (529, 860, and 2284 bp) upstream of the H19 transcription start site (TSS) (Fig. 5E). These binding sites were significantly enriched for AR binding after DHT treatment (Supplementary Fig. 12D), whereas loss of AR occupancy was demonstrated upon ENZA treatment (Fig. 5F). These findings were validated using KLK3 as a positive control (Fig. 5F, Supplementary Fig. 12D). Using a construct with the H19 proximal promoter (H19-PP, 850 bp) driving a luciferase reporter, experiments further demonstrated that in C4-2B cells, DHT was capable of decreasing reporter activity. Conversely, ENZA treatment increased H19 transcription, thus confirming that the proximal promoter region is involved in AR regulation of H19 transcription (Fig. 5G).
Androgen deprivation has been shown to elevate the levels of SOX2 16 . This TF regulates lineage plasticity and the induction of NEtD, suggesting that it could also play a role in regulating H19 levels. Experiments demonstrated that SOX2 overexpression in LNCaP cells increased H19 expression while knockdown of SOX2 in C4-2B cells decreased H19 transcription (Fig. 5H). Likewise, ENZA treatment of C4-2B cells increased both SOX2 and H19 expression (Supplementary Fig. 12E). To examine the relationship between AR signaling, SOX2, and H19, ChIP was carried out with an anti-SOX2 antibody on C4-2B cells treated with ENZA (10 µM) or vehicle (EtOH) for six days to induce NEtD. ChIP-qPCR results demonstrated significant enrichment of SOX2 binding on two of the putative SOX2 sites (239 and 1563 bp upstream of the H19 TSS), one of which lies close to the H19 TSS (Fig. 5I). Furthermore, in 42D ENZR cells, ENZA treatment induced an increase in luciferase activity upon transfecting the wild type H19-PP luciferase construct (850 bp), which was blocked by mutating the SOX2 binding site close to the H19 TSS (Fig. 5J). This result confirms the ENZA-mediated SOX2 regulation of H19 transcriptional activity. Together, these experiments point to a mechanism in which androgen signaling initially suppresses H19 transcription; ADT relieves this suppression, and the resulting increase in SOX2 levels further elevates H19 transcription (Fig. 5K).
H19 induces epigenetic changes including modifying histone methylation by binding to the PRC2 complex. The subcellular localization of a lncRNA can guide the identification of function. We observed a significant level of nuclear expression of H19 relative to the cytoplasm (Fig. 6A), suggesting that in NEPC, H19 might predominantly function to regulate gene transcription. Epigenetic reprogramming has been implicated in the NEPC development 20,43 . Our experiments demonstrated that the level of H3K27me3, the target of the PRC2 complex, and H3K4me3 was elevated in NEPC (Fig. 6B) compared to AdPC. The transfection of H19 into LNCaP and V16D CRPC induced H3K27 and H3K4 trimethylation (Fig. 6C). In addition, the transduction of the NEPC organoid OWCM-155 with Lv-shH19 markedly decreased H3K27me3 while only slightly reducing EZH2 expression (Fig. 6D). Together these data indicate that H19 plays a role in the epigenetic changes induced during NEtD. H19 overexpression in LNCaP was shown not to alter the expression levels of PRC2 complex proteins, EZH2, SUZ12, and AEBP5 (data not shown), whereas RNA immunoprecipitation (RIP) in LASCPC-01 ( Fig. 6E) and NCI-H660 ( Supplementary Fig. 13A) demonstrated the binding of EZH2 to H19. RIP analysis of LNCaP and V16D CRPC cells overexpressing H19 confirmed that the H19 transcript was enriched in the immunoprecipitation of endogenous EZH2 (middle) and SUZ12 (right) (Fig. 6E). It has been reported that the association of H19 with EZH2 at a specific region in the 5′ end of the lncRNA is responsible for PRC2 activity 44 . To test whether this interaction is essential for NEtD, we created an EZH2 binding site deletion fragment (H19 DEL , Supplementary Fig. 13B), with a 5′ deletion, and cloned it into a DOX inducible Tripz vector. The addition of DOX to these cells induced NE marker expression in H19 FL but not in the H19 DEL transfected LNCaP cells (Fig. 6F, Supplementary Fig. 13C, D). 
These results establish the functional importance of H19/EZH2 binding in mediating NEtD.
To examine the clinical relevance, we compared the differentially methylated gene set (shH19 vs. control) with a previously established methylation gene set, which compared clinical samples of NEPC vs. AdPC 45 (Supplementary Data 16). After H19 knockdown, the 541 genes with hypermethylation and 260 genes with hypomethylation were found to have reversed methylation status (Fig. 8C). This reversal in methylation status (Adj. p-value < 0.001) was found for genes associated with an NEPC signature 45 including RGS7, CCND1, SPDEF, GATA2; AR signaling genes, TMPRSS2, PMEPA1; NE phenotype genes, CHGA; and cell fate commitment genes, ASCL1, HES5, KLF4, POU4F1 (Supplementary Data 16). Functional association analysis using Gene Ontology (GO) demonstrated that genes with hypermethylation are enriched for cell migration and neuron generation (p-value < 0.001, e.g., HMX1, ENPEP, SOC2, and P2RY6). Conversely, the genes with hypomethylation were enriched for pattern specification processes, sequence-specific DNA binding, and negative regulation of cell differentiation (Fig. 8C, e.g., RGS7, SPDEF, DLL3 and NKX2.1). The reversal of the methylation of these genes was present in the chromosomal loci observed from Beltran et al. 45 (Fig. 8D). Using the RNA extracted from organoids analyzed for methylation changes, gene expression was found to be decreased for hypermethylated genes, e.g., P2RY6, SOCS3, or increased for hypomethylated genes, e.g., RGS7, ERG, CCND2, CDH4 (Fig. 8E). These results strongly point to the role of H19 in regulating the methylation of genes associated with the NEPC phenotype.
H19 is a putative diagnostic and predictive biomarker for NEPC. Currently, there is an unmet clinical need for reliable diagnostic, prognostic, and predictive biomarkers for NEPC 45,49,50 . To test whether H19 or other recently identified genes associated with NEPC were diagnostically useful, we analyzed sequenced clinical NEPC samples (n = 50, Table 1). The AdPC samples from this study were used as a control. For each NEPC sample and test gene, sensitivity and specificity analyses were performed. A gene/lncRNA was considered expressed or not expressed if it was >1 or <1 standard deviation from the control groups' expression, respectively. We tested genes related to AR activity (AR, KLK2, and PCA3), commonly used NEPC markers (CHGA, SYP, and NSE), and four candidate test genes (BRN2, SRRM4, PEG10, and H19) (Fig. 9A). As expected, many samples showed no AR expression or activity. Concerning NEPC markers, SYP had the greatest sensitivity (90%), and NSE had the greatest specificity (87%). Notably, the four test transcripts, BRN2, SRRM4, PEG10, and H19, had sensitivities of 86%, 88%, 92%, and 86%, respectively, and performed similarly to NEPC markers. However, with their relatively lower specificities, they would best [...], H19 positively detected NEPC. These data support incorporating these "next-generation NEPC markers", including H19, into clinical use to enhance the ability to detect NEPC. It is estimated that 20-30% of metastatic CRPC tumors develop tNEPC 51 . We explored whether H19 might predict the clinical outcome using the Decipher GRID database, focusing on AdPC samples from the MCII cohort (n = 232, Table 1). MCII represents tumors primarily with unfavorable pathology (i.e., high grade/stage) and long-term follow-up for treatment and outcomes (median 18 years). This cohort contains samples treated with radiotherapy (RT), adjuvant ADT, or post-radical prostatectomy 34,52 .
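The sensitivity and specificity analysis of the diagnostic markers described above can be sketched under one reading of the stated rule: a marker is called positive in a sample when its expression exceeds the control mean by more than one sample standard deviation. That reading, the function name, and the values below are assumptions for illustration only; the source does not give the exact computation.

```python
import statistics


def sensitivity_specificity(case_values, control_values, n_sd=1.0):
    """Sensitivity and specificity of a single marker.

    A sample is called positive when its expression exceeds
    mean(control) + n_sd * stdev(control). This threshold rule is an
    assumed interpretation of the ">1 standard deviation" criterion.
    """
    threshold = statistics.mean(control_values) + n_sd * statistics.stdev(control_values)
    true_positives = sum(1 for v in case_values if v > threshold)
    true_negatives = sum(1 for v in control_values if v <= threshold)
    sensitivity = true_positives / len(case_values)
    specificity = true_negatives / len(control_values)
    return sensitivity, specificity


# Hypothetical expression values (illustration only, not study data).
control = [1.0, 2.0, 3.0, 2.0, 2.0]   # AdPC-like controls
cases = [5.0, 4.0, 2.5, 6.0]          # NEPC-like cases

sens, spec = sensitivity_specificity(cases, control)
```

For markers expected to be lost in NEPC, such as AR-activity genes, the comparison would presumably run in the opposite direction (below the control mean minus one standard deviation); that variant is not shown here.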
We performed survival analysis to establish H19's ability to stratify patients with an increased probability of biochemical recurrence (BCR) or metastasis (MET) as their clinical end-point. ADT-treated patients were grouped by H19 expression into tertiles. We observed that samples with the highest tertile of H19 expression vs. mid or low levels had a significantly higher probability of BCR or MET (Fig. 9C; p-value = 0.00996 and p-value = 0.0162, respectively). With H19 expression stratified in the same manner, we also generated Kaplan-Meier curves in untreated patients. Unlike ADT-treated patients, untreated patients did not significantly differ in the probability of BCR or MET (Fig. 9B; p-value = 0.627 and p-value = 0.880, respectively). Taken together, in patients who receive ADT and consequently have a higher probability of tNEPC, an elevated H19 level is a predictive biomarker for poor survival-related outcomes.
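The tertile stratification and Kaplan-Meier estimation described above can be sketched as follows. The function names and toy inputs are hypothetical, tertile boundaries are set by rank (ties broken by sort order, which is an assumption), and the test used to compare curves is not shown because the source does not name it.

```python
def tertile_groups(expression):
    """Split sample ids into low / mid / high tertiles by expression rank."""
    ranked = sorted(expression, key=expression.get)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {"low": ranked[:cut1], "mid": ranked[cut1:cut2], "high": ranked[cut2:]}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, event-free probability)] at each event time.

    events[i] is True when the end-point (e.g. BCR or MET) was observed at
    times[i], and False when follow-up was censored at times[i].
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        censored = sum(1 for ti, ei in zip(times, events) if ti == t and not ei)
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= observed + censored
    return curve


# Hypothetical inputs (illustration only, not study data).
toy_h19 = {"a": 1.0, "b": 5.0, "c": 3.0, "d": 9.0, "e": 7.0, "f": 2.0}
groups = tertile_groups(toy_h19)

toy_times = [1, 2, 2, 3, 4]                       # follow-up time per patient
toy_events = [True, True, False, True, False]     # True = end-point observed
km_curve = kaplan_meier(toy_times, toy_events)
```

In practice one curve would be estimated per tertile and end-point, and the curves compared with a suitable survival test from a statistics package.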

Discussion
We identify the lncRNA H19 as a driver of PCa lineage plasticity and induction of the NE phenotype. A signature of 122 lncRNAs distinguishes NEPC from AdPC, and H19 is one of the most highly expressed lncRNAs within this signature 28 . Importantly, elevated levels of H19 are associated with higher Gleason grade and neoadjuvant hormone therapy, suggesting a role in PCa progression. These findings are experimentally validated in preclinical models of NEPC, including patient-derived NEPC organoids and murine Trp53/Rb1, Trp53/Pten, and Rb/Pten DKO organoids. In addition, our sequence analysis identifies the longest isoform of H19 as the active variant in NEPC and shows that it is highly conserved, in both sequence and secondary structure, across multiple species. NEtD resulting in a transition of CRPC to NEPC occurs through an intermediary stem-like state in which cells exhibit EMT and stem cell-like features 39 . This lineage plasticity is thought to be reversible. Our data support the potential for H19 to be a central regulator of this bidirectional phenotype, creating a stem cell-like permissive environment for lineage plasticity. This is shown by H19 leading to concomitant changes in NE gene expression, while the reduction of H19 expression in Trp53/Rb1 DKO organoids induced a lineage reversal from an NE-like to a luminal-like phenotype. This suggests that H19 aids cells in acquiring a stem cell-like state that may be essential for lineage plasticity and NEtD. miR-675, which is hosted within H19 and mediates some H19 functions, has been studied extensively 53 . However, our results in murine prostate organoids indicated that elevated miR-675 does not alter the expression of NE genes (Supplementary Fig. 14B-D), suggesting that H19 does not function via miR-675 in our system.
The observation that persistent nuclear AR expression and median serum PSA levels (>60 ng/ml) occur in a subset of small cell NEPC patients suggests that the AR is only part of the control mechanism regulating NEPC function 3 . Lineage reversal mediated by manipulating H19 and EZH2 to restore AR signaling might result in a therapeutically targetable phenotype derived from this aggressive disease.
The control of H19 transcription appears to be complex. The role of androgens in controlling H19 has previously been reported 54 , and we corroborated that H19 transcription is suppressed by androgen. During ADT, SOX2 levels rise and bind to the H19 promoter, increasing H19 transcription, potentiating the induction of the NE phenotype. The combined loss of TP53 and Rb1 inducing H19 expression was consistent with the observation that TP53 and RB1 control the level of SOX2 in PCa cells 16   our data showed that knockdown of SOX2 or OCT4 decreased H19 levels, a feed-forward mechanism may exist in PCa, in which increases to stem cell factors enhance the transcription of H19, and then H19 drives further elevation in these stem cell genes. The promoter region of H19 contains multiple putative TF binding sites, including TP53, E2F, and HIF1α [55][56][57][58] , which could also regulate this lncRNA. Recently, we have shown in multiple cancers that the Pim kinases regulate H19 levels suggesting Pim could play a role in NEtD 59 .
H19 is a maternally imprinted gene whose expression is closely linked to IGF2 through regulation mediated by the ICR1 locus. Studies have shown the loss of H19/IGF2 imprinting in cancer 60,61 . Our methylation assay did not find a loss of imprinting in NEPC (Fig. 3C), and increased IGF2 expression in NEPC was detected compared to H19 (Supplementary Figs. 11F, 14A, and 15A-D). Positive correlation patterns within our clinical samples (Supplementary Fig. 15E) suggest these genes are coregulated. However, in bladder cancer, SOX2 stimulates IGF2 expression 62 , and because SOX2 is elevated upon androgen withdrawal, this may further elevate the transcription of IGF2 in NEPC.
Because EZH2 plays a prominent role in NEPC regulation 63-65 , we examined whether H19 may control the levels of a diverse set of genes by interacting with the PRC2 complex. Our data demonstrated that H19 overexpression in multiple PCa cell types increased H3K27me3 and H3K4me3 levels. In addition, we showed that H19 could bind PRC2 complex members, suggesting a partial mechanism for this effect. LncRNAs, e.g., HOTAIR, can bind to the PRC2 complex and interact with LSD1, a demethylase for H3K4me2, leading to gene activation 66 . However, in preliminary experiments, we have not seen direct binding of LSD1 by H19. Similar to HOTAIR, H19 may also function as a modular bifunctional RNA, but further experiments are required to identify other H19 binding partners. ChIP-seq data further demonstrated a role for H19 as an epigenetic modifier. For NEPC signature genes, histone marks switched from a transcriptionally repressive state to an active state, whereas for androgen signaling genes they switched from an active to a repressive state. Our analysis identified several bivalent genes, poised for further regulation, with a potential role in the NEPC transition. One example is KDM5A, which directly interacts with the PRC2 complex in embryonic stem cells to promote a transcriptionally repressive state during differentiation 67 . It is possible that H19 acts as a scaffold for KDM5A and EZH2. Thus, further studies are warranted to investigate the exact mechanism of chromatin reprogramming by H19.
Genome-wide single cytosine DNA methylation analysis has previously shown strong epigenetic segregation between NEPC and AdPC subtypes within patient samples 45 . Compared to AdPC, DNMT1, 3B, and 3A are elevated in NEPC 45 and could drive this change. In addition, our data demonstrate that H19 knockdown induced significant differences in the DNA methylome, both hypo- and hypermethylation, with many of the targeted chromosomal loci identical to those previously identified in a comparison of NEPC with AdPC 45 . These findings point to a complex mechanism by which H19 regulates histone modification and DNA methylation, two hallmarks of epigenetic regulation.
Survival analysis of patient cohorts showed that ADT-treated patients had an increased probability of developing biochemical recurrence or metastasis when H19 was elevated. In patients not treated with ADT, H19 levels were not associated with a difference in the probability of biochemical recurrence or metastasis. This result is consistent with our in vitro observations that tNEPC induced by androgen blockade is associated with higher H19 levels. In addition, our results demonstrate that H19 performed comparably to other NEPC biomarkers used for immunohistochemistry-based diagnosis. Given the rapid advances in blood-based liquid biopsies, their recently shown efficacy in NEPC 68 , and the relative stability of lncRNAs, H19 levels could be an essential contributor to the diagnosis of tNEPC in ADT-treated patients.
In summary, we show that H19 is highly expressed in patients with NEPC, is a putative diagnostic and predictive marker associated with disease outcome, and is a regulator of NE and AR signaling associated with the induction of NEPC. Most significantly, we show evidence that it drives lineage plasticity from a luminal to an NE phenotype, and that H19 knockdown reverses this transition and results in tumors becoming ADT sensitive. For these reasons, this lncRNA warrants serious consideration in the clinical management of patients on ADT at risk of tNEPC, and as a therapeutic target for reversing tumor plasticity to induce a treatable form of PCa.

Methods
Clinical patient samples and cohorts. We used nine clinical cohorts, five sequenced and four profiled by microarray (Table 1). For the VPC, 84 specimens were collected as previously described 21,28,69,70 and amalgamated for this study. For WCM-1 and WCM-2, 37 and 49 specimens, respectively, were collected as previously described (WCM-1 45 and WCM-2 18 ). For WCDT, 45 specimens from a larger cohort of 200-300 (collection ongoing) were collected as previously described 3,71 . For BCCA, 15 specimens were collected as previously described 72 . For cohorts MCI and MCII, a total of 813 samples were collected as previously described (MCI 73 and MCII 52 ). The JHMI samples, 33 in total, were retrieved from surgical pathology and consultation files of Johns Hopkins Hospital (Johns Hopkins Registry) from 1999 to 2013, as previously described 74 . The 33 samples were annotated originally as six morphologically diagnosed pure prostate small cell carcinoma samples (SCPC), 12 high-risk (Gleason 9-10) adenocarcinoma (AdPC), 10 SCPC (SC-mixed), and 5 AdPC (AdPC-mixed) from mixed histology tumors containing separate AdPC and small cell components. For this cohort, samples were re-classified by their genomic signature as previously described: 10 SC/NE-like (NEPC), 10 Mixed Pathology (MX-P), and 13 Adenocarcinoma (AD). GRID prospective samples, a total of 26,245 (16,806 from radical prostatectomy (GRID-RP) and 9439 from biopsy tissue (GRID-BX)), were collected from the clinical use of the Decipher test as previously described 35 .
Cell culture. HEK293T, LNCaP, C4-2B, VCAP, PC3, LASCPC-01 75 , and NCI-H660 cell lines are from the American Type Culture Collection (ATCC). The cell lines were cultured as recommended by the ATCC. Dr. Amina Zoubeidi (University of British Columbia, Vancouver, BC) provided V16D CRPC , 42D ENZR , and 42F ENZR cells, which were cultured as described previously 21 . LAPC-4 cells (RRID: CVCL_4744) were provided by Dr. Charles Sawyers (Sloan Kettering Memorial Center, NY, USA) and were cultured as described previously 39 . Regular testing of mycoplasma contamination was performed in these cell lines with MycoAlert™ Mycoplasma Detection Kit (LT07-118, Lonza), and only mycoplasma-free cells were used for experimentation.
Organoid culture. Dr. Himisha Beltran provided NEPC patient-derived organoids (OWCM-154, OWCM-155, OWCM-1078, and OWCM-1262), which were cultured as described previously 37 . Prostate cancer biopsies were provided by the University of Arizona Cancer Center Tissues Acquisition and Cellular/Molecular Analysis Shared Resources, and the study was conducted under the University of Arizona Institutional Review Board approval as previously described 76 . Since the patient-derived samples were de-identified, the human research review determined the study to be not human subject research. The cultures were replenished with fresh media every 3-4 days during organoid growth. Dense cultures with organoids ranging in size from 200-500 µm were passaged weekly. For murine organoids, all animal experiments were performed in accordance with protocols approved by The University of Arizona Institutional Animal Use and Care Committee (IACUC). Following established procedures 77 , cells were collected from mouse prostates and cultured in growth factor-reduced Matrigel in ADMEM/F12 along with EGF, Noggin, and R-spondin (ENR) supplements. The organoids were collected and trypsinized, and smaller cell fractions were then incubated in a plate containing ENR-supplemented media. Prostate organoid cultures were bio-banked using Bambanker (Gibco) at −80°C.
Patient-derived organoid xenograft studies. A total of 500,000 cells derived from OWCM-155 NEPC organoids (shSCR and shH19 groups, n = 4 mice per group) were injected with Matrigel (Corning) 1:1 subcutaneously into NOD SCID gamma (NOD.Cg-Prkdc^scid Il2rg^tm1Wjl/SzJ) male mice (Jackson Laboratories, Bar Harbor, Maine). Mice used for xenografts were 5-7 weeks old. The mice were housed in ventilator racks (RAIR IVC system, Lab Products Seaford, DE) and maintained under specific pathogen-free conditions. The mice were fed NIH-31 irradiated pellets (Tekland Premier, Madison, Wisconsin), and sterile water was freely available. Daily light cycles were kept consistent in the animal facility (12 h light and 12 h dark). Cages were changed entirely once a week. Sentinel mice were screened monthly by ELISA serology for mycoplasma, mouse hepatitis virus, pinworms, and Sendai virus and tested negative. Tumor volume was measured every week with a caliper. After 2-3 months of tumor growth, when the largest tumor reached the maximum allowable tumor burden, the mice were euthanized in a CO2 chamber. The tumors were then excised, collected, and weighed. Part of each harvested tumor was fixed in 10% neutral buffered formalin, paraffin-embedded, and subjected to immunohistochemical staining for various markers. The other part of the harvested tumor was used for RNA extraction. Animal care and experiments were carried out in accordance with the University of Arizona IACUC guidelines.
Lentiviral plasmids. Knockdown of human H19 was performed using the lentiviral plasmid pLenti-siH19-GFP (Abcam, #i009382); pLenti-scrambled siRNA-GFP (Abcam, #LV015-G) was used as a control. These siH19 plasmids allow either direct non-viral plasmid transfection for immediate expression (siH19) or packaging into lentiviral particles for high-efficiency transduction and stably integrated expression (shH19). Of the four siRNA target sequences we tested, two (shH19-C and shH19-D) demonstrated a functional knockdown.

Primers. Primers for qPCR and ChIP qPCR are listed in Supplementary Data 17.
Cell viability (XTT) assay. LNCaP cells stably expressing control or H19 knockdown constructs were seeded into 96-well plates at a density of 5000 cells per well and were allowed to grow for 72 h. Following the manufacturer's protocol, cell viability was measured using the XTT cell proliferation assay (Trevigen Cat # 4891-025-K). To test the effect of enzalutamide (ENZA) on cell proliferation, LNCaP cells with H19 overexpression or H19 knockdown, with or without TP53/Rb1 deletion, were seeded at 5000 cells/well. ENZA was added at the indicated doses for 72 h, using DMSO as control, and cell viability was measured with the same XTT assay.
Lentiviral particle production. Lentiviral particle production and infection were performed as described previously 82 . Briefly, lentiviral vectors were co-transfected with psPAX2 and pVSVG vectors into HEK293T cells. Supernatants were collected at 24 and 48 h after transfection, concentrated by ultracentrifugation in an SW28 (Beckman) rotor, and stored at −80°C. For infection of adherent cells, 10^6 cells per well were seeded in six-well plates and infected with concentrated lentiviral particles 24 h post-seeding. Established procedures were used for lentiviral transduction of organoids by spinoculation 37 . For the viral transduction protocol, control lentivirus was used at the same titer as the experimental virus (10^8 TU/ml). The transduced organoids were passaged at least twice before the cell growth assays were performed to minimize lentiviral toxicity effects.
Doxycycline inducible system for H19 expression. For inducible overexpression of H19 (full length, H19 FL, and the EZH2 binding site deletion mutant fragment, H19 DEL), the plasmids Tripz-H19 FL and Tripz-H19 DEL were constructed. The full-length sequence and the shorter segment of H19 were amplified by PCR from the pLenti-H19 plasmid to generate H19 FL and H19 DEL. These fragments were then individually cloned into the Tripz vector to generate the inducible expression vectors. Tripz-H19 FL and Tripz-H19 DEL were then transfected into HEK293T cells to generate lentivirus following the above method. C4-2B or LNCaP cells were then transduced with these lentiviral vectors and selected with puromycin (1 µg/ml) to generate stable cell lines.
miR-675 expression system. To express H19 fragments in mouse organoids, we used a retroviral vector expressing full length and Middle H19 fragment (741-1407) encoding miR-675 fragments of H19 83 (kind gift from Dr. Anindya Dutta). pMSCV retro was used as a control. Retrovirus encoding these fragments was produced, and murine prostate organoids were transduced with the control and H19 fragments. After stable transduction of these organoids post-two passages, we extracted the RNA, and qPCR was performed.
Real-time PCR and gene expression analysis. Total RNA was isolated from cells using TRIzol reagent (Invitrogen, 15596-018). For organoids, 700 μl RLT buffer from an RNeasy mini kit (Qiagen) was added to each well and gently triturated to dissolve the Matrigel, and the lysate was then collected in microfuge tubes for further RNA extraction following the manufacturer's protocol. One microgram of total RNA was reverse transcribed using an i-Script cDNA Synthesis System kit (Biorad, 1708891) following the manufacturer's protocol. qPCR primer pairs were selected, and at least three primer pairs per gene were tested before using them for experiments.
Chromatin immunoprecipitation (ChIP). ChIP assay was performed using the SimpleChIP® Plus Enzymatic Chromatin IP Kit (Cell Signaling cat # 9005) according to the manufacturer's protocol. Chromatin was fragmented using the Bioruptor® Pico sonication device (Diagenode). Equal volumes of chromatin were immunoprecipitated with antibody against AR (Cell Signaling, cat # 5153) or SOX2 (Cell Signaling cat # 23064S; SantaCruz Biotech cat # sc-365823), or with mouse or rabbit IgG as a negative control. Primers for each binding site are listed in Supplementary Data 17.
DNA methylation detection. MethylMeter assays for molecular beacon-based detection were designed and performed as described previously 38 . DNA samples were cleaved with MseI and were fractionated without purification. Fragmented samples were separated into methylated and unmethylated fractions with the MethylMagnet® kit (cat# MM101K, RiboMed Biotechnologies) following the manufacturer's protocol. A target-specific primer with a 5′ truncated promoter extension (5′ CTTACAATGCATGCTATAATACCACTATCGGTGCTTTATTTAAGCGCGGAATTTGCTGTGCTCAT) and a reverse primer (5′ AGTGAATAAGGCTTGCCCTGACGAGGACTCAAGTCACGCCTACC) targeted a 1624 nt MseI fragment located 365 nt upstream of the H19 long-variant 1 RNA TSS. CAPS detection reactions were performed as described earlier (1). Amplicons with a full-length promoter were made with a promoter-specific primer (5′ CCTTTAAAGAAAATTATTTTAAATTTATGTTTGACAGATCTTACAATGCATGCTATAATACCA) and a universal reverse primer (5′ AGTGAATAAGGCTTGCCCTGACGA) as previously described. The H19-specific annealing temperature was 60.7°C. Fluorescence signals were generated when abortive transcripts from the synthetic promoters contributed to opening a molecular beacon. Methylation results were expressed as a percentage: the methylated DNA signal divided by the sum of the methylated and unmethylated signals. As a control sample, DNA extracted from the urine sample of a 42-year-old healthy male was used.
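The percentage calculation stated above (methylated signal over the sum of methylated and unmethylated signals) can be written out as a short helper. The function name and the signal values are illustrative assumptions; the signals stand for the fluorescence readings of the two fractions.

```python
def percent_methylated(methylated_signal, unmethylated_signal):
    """Percent methylation = 100 * M / (M + U)."""
    total = methylated_signal + unmethylated_signal
    if total == 0:
        raise ValueError("no signal in either fraction")
    return 100.0 * methylated_signal / total

# Hypothetical fluorescence signals for one sample
print(percent_methylated(30, 70))
```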
Western Blotting. Total protein was extracted from adherent cells grown in vitro and organoid cultures as described previously 76
Cell and organoid growth assay. Assessment of proliferation was conducted using the IncuCyte system. Briefly, after 48 h of siRNA treatment, cells were passaged, counted, and seeded at 2,000 cells/well in replicate on a 96-well plate (Corning) on day 1, with IncuCyte readings taken at 24-h cycles starting from day 0. Media was replenished on day 7. Confluence area calculations made by the IncuCyte algorithm were normalized to day 0 and analyzed using GraphPad Prism Software.
To measure organoid growth, organoids were dissociated with TrypLE (Invitrogen) into tiny cell clusters, and 5000 cell clusters were plated per well and then incubated for six days using DMSO as control. A real-time imaging system (IncuCyte) was used to measure cell proliferation using the organoid module. The images were captured every 12 h. The percentage confluence of organoids was plotted against time for organoid viability analysis.
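The day-0 normalization applied to the IncuCyte confluence readings in the growth assays above amounts to dividing each reading by the first one. A minimal sketch, with a hypothetical function name and made-up confluence values:

```python
def normalize_to_day0(confluence):
    """Express each confluence reading as a fold change relative to day 0."""
    baseline = confluence[0]
    if baseline == 0:
        raise ValueError("day-0 confluence is zero; cannot normalize")
    return [value / baseline for value in confluence]

# Hypothetical percent-confluence readings at successive time points
print(normalize_to_day0([10.0, 15.0, 40.0]))
```

Normalizing each well to its own starting value makes curves comparable across wells that were seeded at slightly different densities.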
Transwell invasion assays. H19 knockdown mouse TP53/RB1 organoids and their corresponding control cells were placed in a Corning FluoroBlok 24-well Transwell plate (8-μm pore size; Corning) as previously described 84 . Briefly, organoids were dissociated into cell clusters, and these clusters (10^4/well/condition) were suspended in 200 μl of Matrigel/ADMEM (1:3) and added to the upper chamber of transwell inserts. The lower well was filled with 500 μl of mouse prostate organoid media in contact with the insert membrane. The cells were allowed to migrate for 72 h post-plating. Images of the bottom of the insert were captured for migrated cells, and relative GFP fluorescence was measured by ImageJ (NIH) at four different microscopic fields.
The invasive abilities of cell models were assessed by using Matrigel-coated 24-well plate inserts (Corning® BioCoat™ Matrigel® Invasion, Corning, NY) according to the manufacturer's instructions. Briefly, 20,000 cells were seeded in the top chamber of a Matrigel-coated 24-well plate insert in a serum-free medium. FBS (10%) was added to the lower chamber as a chemo-attractant. After 20 h, cells were fixed and stained with DAPI, the filter was fluorescently imaged, and the cells remaining on the filter were counted using ImageJ software.
RNA immunoprecipitation. A RIP assay was carried out using a Magna RIP RNA Binding Protein Immunoprecipitation Kit (Millipore; Cat# 17-700) according to the manufacturer's instructions. Briefly, LASCPC-01 cells and control vector or H19-overexpressing LNCaP and V16D cell lines were grown in 15 cm culture dishes, and approximately 20 × 10^6 cells were harvested with ice-cold PBS and pelleted for 5 min at 1500 rpm and 4°C. The resulting pellets were lysed in RIP lysis buffer containing protease inhibitor cocktail and RNase inhibitor, followed by centrifugation at 14,000 rpm, 4°C for 10 min. An aliquot of the resulting supernatant was incubated with magnetic beads pre-conjugated with EZH2 (Cell Signaling; Cat. No. 5246S), SUZ12 (Cell Signaling; Cat. No. 3737S), or rabbit IgG (Cell Signaling; Cat. No. 2729S) antibodies at 4°C. After overnight incubation, the immunoprecipitated RNA was washed and purified. cDNA was reverse transcribed from RNA using a High Capacity RNA-to-cDNA Kit (ABI; Cat. # 4387406), and binding targets were quantified by RT-qPCR (Bio-Rad). For the RIP assays with NCI-H660, the cells were harvested, washed with 1× PBS, resuspended in 1× PBS, and incubated in 1% formaldehyde. Following cross-linking, cells were pelleted, washed twice with 1× PBS to remove residual formaldehyde, and lysed. The nuclei were pelleted and resuspended, and the chromatin was sheared by sonication. The lysate was added to magnetic beads conjugated to 5 µg of the desired antibody, and the mixture was incubated overnight at 4°C. The following day the supernatant was removed using a magnetic rack, and the beads were washed five times. The beads were incubated at 70°C for 1 h to reverse crosslinks, followed by the addition of proteinase K and a 30-min incubation at 55°C to digest proteins. The RNA was removed from the beads and purified with a QIAGEN RNeasy mini kit (Cat No. 74104).
The eluted RNA was converted into cDNA using Invitrogen Superscript IV reverse transcriptase (Cat. No. 18091050). Lastly, qPCR was performed using SYBR green master mix (Roche) for H19 (forward primer: CAGGAGTGATGACGGGTGGA, reverse primer: CAGCTGCCACGTCCTGTAA). Successful immunoprecipitation of EZH2- or SUZ12-associated RNA was verified by qRT-PCR using H19 primers, with various negative controls, including U1 snRNA, IGF2/H19 ICR, and Sox2, for validation of on-target and non-target associated RNA.
Enhanced reduced representation bisulfite (eRRB) sequencing. The Weill Cornell Medical Center Computational Genomics Core Facility performed enhanced reduced representation bisulfite sequencing. Briefly, bisulfite reads were aligned to the bisulfite-converted hg19 reference genome using Bismark 75 . All samples had bisulfite conversion rates of >99.7%. Percent methylation scores of bisulfite-converted cytosines (T's, unmethylated C's) and non-converted C's (methylated C's) for CpG cytosine methylation were analyzed further using MethylKit (v. 1.11) and R (v. 3.4). Methylation changes for each organoid (OWCM-155-shH19 vs. OWCM-155-Scr) were analyzed separately. Genome-wide differential methylation was calculated between control and shH19 samples. Differential methylation scores beyond a cutoff of ±25 were considered significant. The GRCh37/hg19 genome and hg19 CpG sites bed files were downloaded from UCSC and were used as a reference to annotate the differentially methylated regions. A window of ±2 kb upstream and downstream of transcription sites was searched for methylated regions. Custom R scripts were used for file processing and summarizing the output. Bed files were visualized and analyzed using the Integrative Genomics Viewer (IGV v3.5). Functional association analysis was done using DAVID, and networks were generated using EnrichmentMap in Cytoscape.
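The ±25 cutoff on differential methylation described above can be illustrated with a small filter over per-CpG percent methylation values. This is a sketch under stated assumptions, not the MethylKit workflow used in the study: the dictionary layout, the function name, the inclusive comparison at exactly 25, and the example values are all hypothetical, and no statistical testing is performed here.

```python
def differential_methylation(control, knockdown, cutoff=25.0):
    """Return sites whose percent-methylation difference meets the cutoff.

    control, knockdown -- dicts mapping (chromosome, position) to percent
                          methylation (0-100) in each condition
    The difference is knockdown minus control, so negative values indicate
    hypomethylation and positive values hypermethylation after knockdown.
    """
    hits = {}
    for site in control.keys() & knockdown.keys():
        diff = knockdown[site] - control[site]
        if abs(diff) >= cutoff:  # inclusive cutoff is an assumption
            hits[site] = diff
    return hits

# Hypothetical CpG sites (not study data)
scr = {("chr1", 100): 80.0, ("chr1", 200): 50.0, ("chr2", 50): 10.0}
sh_h19 = {("chr1", 100): 40.0, ("chr1", 200): 60.0, ("chr2", 50): 45.0}

print(differential_methylation(scr, sh_h19))
```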
Chromatin immunoprecipitation (ChIP) sequencing. To crosslink proteins to DNA, 540 μl of 37% formaldehyde was added to each 15 cm culture dish containing 20 ml medium for 10 min, followed by 2 ml of 10× glycine; dishes were swirled briefly to mix and incubated for 5 min at room temperature. Media was removed, and cells were washed two times with 20 ml ice-cold 1× PBS, completely removing the wash from the culture dish each time. Two milliliters of ice-cold PBS containing protease inhibitor cocktail was added to each 15 cm dish, and cells were scraped into the cold buffer. Cells from all culture dishes were combined into one 15 ml conical tube and centrifuged at 2000×g in a benchtop centrifuge for 5 min at 4°C. The supernatant was removed, and the pellet was stored at −80°C. ChIP for H3K27me3 and H3K4me3 antibodies was performed by the Chakravati lab at Northwestern University using their established procedures 85 , with minor modifications. Cell pellets were resuspended in lysis buffer 1 (50 mM HEPES-KOH pH 7.6, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% IGEPAL-CA630, 0.25% Triton X-100, 1× protease inhibitors) and incubated for 10 min at 4°C with gentle inversion. Nuclei were recovered by centrifugation (2000×g, 5 min, 4°C), resuspended in lysis buffer 2 (10 mM Tris-HCl pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 1× protease inhibitors), and extracted for 10 min at 4°C with gentle inversion. Nuclei were again recovered by centrifugation, resuspended in lysis buffer 3 (10 mM Tris-HCl [pH 8.0], 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% sodium deoxycholate, 0.5% Sarkosyl, 1× protease inhibitors), and sonicated in an ice-water bath using a Misonix microtip-equipped sonicator at setting 5 (~5 W root mean square output power) for 12 cycles of 15 s on and 45 s off. The sheared chromatin was adjusted to 1% Triton X-100 from a 10% stock solution, and debris was removed by centrifugation at 20,000×g at 4°C for 20 min. The protein concentration of solubilized chromatin was determined by BCA assay.
Approximately 700 μg of chromatin was immunoprecipitated overnight at 4°C with antibodies for H3K4me3 (Diagenode, C15410003, 3 μg) and H3K27me3 (Cell Signaling Technologies, 9733, 10 μl). Protein G Dynabeads (30 μl) were added, and immunoprecipitations continued for 3 h. Beads were washed four times with 1 ml of ChIP-RIPA wash buffer (50 mM HEPES-KOH [pH 7.6], 500 mM LiCl, 1 mM EDTA, 1.0% IGEPAL-CA630, 0.7% sodium deoxycholate) and once with 10 mM Tris-HCl [pH 8.0], 1 mM EDTA, 50 mM NaCl. The recovered protein-DNA complexes were eluted from the beads twice with 50 μl of 0.1 M NaHCO3, 1% SDS at 65°C for 15 min with shaking. ChIP and input DNA were adjusted to 0.2 M NaCl, and formaldehyde crosslinks were reversed by heating at 65°C overnight. DNA was treated sequentially with RNase A and proteinase K and purified using MinElute PCR purification columns (Qiagen). Recovered DNA was quantitated using a Qubit fluorometer (Thermo). ChIP and input DNA libraries were prepared with 5 ng of DNA using KAPA Hyper Prep kits (Kapa Biosystems, KK8502) per the manufacturer's instructions and included a post-adapter ligation size selection step (0.6×-0.9×) using Ampure XP SPRI magnetic beads (Beckman Coulter, A63881). PCR-amplified (11 cycles) libraries were quantitated by Qubit, assessed with a Bioanalyzer (Agilent), and sequenced using 75 bp single-end reads on an Illumina NextSeq 500. Sequencing depths and aligned reads per sample are listed in Supplementary Data 19-20. Data generated were then processed by the Epigenomics core at Weill Cornell Medicine. Briefly, peaks for each replicate (n = 3) from each cell line were called from BAM alignment files using MACS2 86 with default parameters. Narrow peaks were called for the H3K4me3 mark, and broad peaks were called for the H3K27me3 mark using the same input for each sample.
The peaks were then assessed for coverage and signal distribution using the ChIPQC Bioconductor package and interrogated for peak occupancy and differential binding using the DiffBind R Bioconductor package. For occupancy analysis, consensus peaks were generated (using the replicates for each mark and cell line) with an overlap rate of 0.66, and bivalent/biphasic regions were defined as regions of overlap between H3K4me3 and H3K27me3 marks for each cell line. Consensus peaks and bivalent/biphasic regions were annotated for proximity to genes using the ChIPseeker R Bioconductor package. Differential binding analysis for V16D/H19 vs. V16D/CTL was performed for the H3K4me3 and H3K27me3 marks. Significantly differentially bound sites for each comparison were then annotated for proximity to genes using the ChIPseeker R Bioconductor package. GO classification and enrichment analysis for biological processes, molecular functions, and cellular components was performed using the clusterProfiler Bioconductor R package (pAdjustMethod = "BH", pvalueCutoff = 0.05, qvalueCutoff = 0.05).
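The definition of bivalent/biphasic regions as overlaps between H3K4me3 and H3K27me3 peaks can be illustrated with a plain interval intersection. This is a simplified sketch and not the DiffBind/ChIPseeker workflow: it assumes half-open (start, end) intervals on a single chromosome that do not overlap within each list, and it reads the overlap rate of 0.66 as "present in at least two of three replicates", which is an interpretation rather than something stated in the text. All names and coordinates are hypothetical.

```python
import math

def min_replicates(n_replicates, overlap_rate=0.66):
    """Smallest number of replicates a peak must appear in (assumed reading)."""
    return math.ceil(overlap_rate * n_replicates)

def intersect_intervals(peaks_a, peaks_b):
    """Return the overlapping segments between two peak lists.

    Each list holds half-open (start, end) intervals on one chromosome,
    with no overlaps inside a single list.
    """
    a, b = sorted(peaks_a), sorted(peaks_b)
    i = j = 0
    overlaps = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps

# Hypothetical consensus peaks on one chromosome
h3k4me3 = [(100, 200), (500, 600)]
h3k27me3 = [(150, 550), (900, 1000)]

print(min_replicates(3))
print(intersect_intervals(h3k4me3, h3k27me3))
```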
Sequence and structure conservation. Multiple sequence alignment (MSA) and phylogenetic analysis were carried out using Ensembl's 'Comparative Genomics' analysis tools and the 'Genomic alignments' analysis feature. The H19 (ENSG00000130600) DNA sequence from the human genome build hg38 was compared against 69 (70 including human) available whole-genome eutherian ortholog sequences. Species with no alignment in this region were excluded from this analysis (n = 17), which resulted in 47 species in total of the 70 available within Ensembl v93. In brief, and as outlined in the Ensembl website documentation, the following method was used to produce an MSA and phylogenetic tree for H19. Pairwise whole-genome alignments were used to determine conservation and to study the same genomic region in multiple species. LastZ and its predecessor BlastZ are used to align the genome sequences at the DNA level 87,88 . The genomes are compared to one another for comparison between species and to themselves to identify paralogous regions. Whole-genome alignments are the results of post-processing the raw LastZ (or BlastZ) results. Original blocks are chained according to their location in both genomes. The netting process chooses for the reference species (human) the best subchain in each region. These alignments are used to calculate synteny and for scoring orthologue quality. Synteny is defined as the conserved order of aligned genomic blocks between species. It is calculated from the pairwise genome alignments created by Ensembl when both species have a chromosome-level assembly. The search is run in two phases: (1) Search for alignment blocks in the same order in the two genomes. Syntenic alignments that are closer than 200 kb are grouped into a synteny block. (2) Groups in synteny are linked, provided that no more than two non-syntenic groups are found between them, and they are less than 3 Mb apart.
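The first phase of the synteny search described above, grouping alignment blocks that lie closer than 200 kb, can be illustrated with a deliberately simplified sketch. It considers coordinates on a single chromosome of one genome only and ignores block order in the second genome and the phase 2 linking rule, so it is not Ensembl's implementation; the function name and coordinates are hypothetical.

```python
def group_blocks(blocks, max_gap=200_000):
    """Merge alignment blocks separated by less than max_gap into groups.

    blocks -- list of (start, end) coordinates on one chromosome
    Returns a list of (start, end) groups in coordinate order.
    """
    groups = []
    for start, end in sorted(blocks):
        if groups and start - groups[-1][1] < max_gap:
            # Close enough to the previous group: extend it
            groups[-1] = (groups[-1][0], max(groups[-1][1], end))
        else:
            groups.append((start, end))
    return groups

# Hypothetical alignment blocks
example_blocks = [(0, 50_000), (100_000, 150_000), (500_000, 600_000)]
print(group_blocks(example_blocks))
```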
A full description of the ortholog genome sequences used, available alignments, phylogenetic synteny calculations, and Ensembl's comparative genomic resources is outlined by Herrero et al. 89 . Other Ensembl resources used in this analysis are described by Zerbino et al. 90 and on the Ensembl website for comparison and analysis of genomes: https://uswest.ensembl.org/info/genome/compara/analyses.html. RNA secondary-structure conservation and arc diagrams for visualization across the 47 of 70 species with sufficient H19 gene coverage were performed using the R-chie algorithm 91 . The single covariance function was used to estimate covariation in the secondary structure and conservation of 47 species. Coloring of arcs was done based on eight ranges of covariance values (purple to orange), which were calculated based on the base pair covariation range for the input MSA 91 . Covariance values ranged from −2.00 (purple: little structure change and high conservation) to 2.00 (orange: high structure change and low conservation). Input parameters included the human H19 secondary structure and the 47-species Ensembl-generated MSA. RNA secondary structure predictions, including MFE calculations, MFE plots, and circle plots, were done using the mFOLD algorithm 92,93 . The coloring schema for mFOLD's circle plots can be found on their website (http://unafold.rna.albany.edu/www-NAR03/doc/colors.php). The secondary structure used for the conservation analysis with R-chie was the human H19 gap-inserted sequence generated from Ensembl's MSA (described above). All default parameters were selected, as outlined in their paper and webserver (http://unafold.rna.albany.edu/?q=mfold/). Alternative secondary-structure prediction and MFE plots were generated from RNAfold to visualize base pair probabilities integrated within the structure 94 . All default parameters were selected, as outlined in their paper and webserver (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi).
RNA sequence analysis. We implemented a lncRNA sequence analysis pipeline that includes algorithms tailored to detecting known and novel transcripts. Developed in-house, this pipeline is modified and extended from the Tuxedo suite of sequence analysis algorithms 28,95. Once received from the sequencing center in BAM format, all sequenced model systems and patient samples were de-aligned into raw FASTQ format (including flagged reads) using bam2fastq and put through the pipeline. To ensure high-quality sequence reads, libraries were trimmed using a windowed-adaptive approach (Sickle; https://github.com/ucdavis-bioinformatics/sickle). The algorithm determines the optimal inner read sequence for each read pair, processed together, by trimming both the 3′ and 5′ ends based on quality and length thresholds (for a full description, see http://bioinformatics.ucdavis.edu/software/). Bases with less than 99.0% base call accuracy (corresponding to a Phred quality score of 20) were removed. Reads shorter than ~2/3 of the read length (30 nt in WCM and 60 nt in VPC) after trimming were discarded. Highly repetitive sequences (>2% of the library) were also discarded after trimming using the cutadapt tool. All quality control metrics were generated and quantified (pre- and post-trimming) using FASTX-Toolkit and FastQC software. Reads were aligned to the hg38 human genome build using an unspliced aligner for handling exonic reads (Bowtie v2.2.3), in conjunction with a spliced aligner to handle reads spanning exon-exon junctions (TopHat v2.0.12). Transcriptome reconstruction using Ensembl v86 gene tracks and human genome build GRCh38 was performed for each library using a quasi de novo (genome-guided) approach (Cufflinks v2.2.1), in which reads were assembled and abundances estimated using an overlap graph producing a minimal spanning network of transcripts. With this isoform-aware approach (Cufflinks), alternative isoforms or transcript variants can be identified and quantified.
This version of Ensembl contained 38 transcript classes grouped into four core biotypes. At this stage, transcripts were also corrected for multi-mapped reads and fragment bias. Highly abundant transcripts (e.g., rRNAs) were masked from downstream steps to increase transcript quantification accuracy. Sample transcriptomes, the reference genome, and the transcript annotation were then meta-assembled (Cuffmerge) to produce a single annotation transcriptome model. Based on this model, transcript quantification (Cuffquant) and normalization (Cuffnorm), accounting for varying library depths and transcript lengths, were performed. Transcript expression values displaying computational artifacts (values < 0.1, known to occur with Cufflinks) were converted to zero. All algorithms denoted in brackets are referenced and previously described 95.
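Two numerical conventions in the pipeline above can be made concrete with a short sketch: the standard relation between a Phred quality score and base call accuracy (accuracy = 1 − 10^(−Q/10), so Q20 corresponds to 99%), and the conversion of expression values below 0.1 to zero. The function names and example values are hypothetical and are not taken from the in-house pipeline.

```python
def phred_to_accuracy(q):
    """Base call accuracy implied by Phred score q: 1 - 10^(-q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def floor_low_expression(values, threshold=0.1):
    """Set expression values below the threshold to zero."""
    return [0.0 if v < threshold else v for v in values]


# A Phred score of 20 corresponds to 99% base call accuracy.
q20_accuracy = phred_to_accuracy(20)

# Hypothetical expression values; entries below 0.1 become 0.0.
floored = floor_low_expression([0.03, 0.1, 2.5, 0.099])
```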
Statistical analysis, reproducibility, and data representation. Data shown are from ≥3 independent experiments or biological replicates. Statistical analysis of changes was performed by unpaired Student's t-tests or two-way ANOVA (Tukey's or Sidak's multiple comparison tests), as noted. Significance is represented by *p < 0.05; **p < 0.01; ***p < 0.001, and ****p < 0.0001 unless specifically noted. Reproducibility was ensured for all the representative WBs and micrograph images by repeating the experiment in 3 different biologically independent conditions. TF binding sites for H19 were identified using Tomtom and JASPAR (Supplementary Data 4-6). The programming language R v3.0 was used for statistical analysis. Unsupervised hierarchical clustering was performed with the hclust function, using Pearson correlation for distance and average linkage. The clustering and heatmaps were generated using the heatmap.2 function. Similar clustering analysis was performed for GRID cohorts, except that Euclidean distance and Ward's method for linkage were used, together with the heatmap.3 function for its advanced row/column labeling features. Normalized log2 expression values were standardized/scaled using a Z-score that ranged from −2 to 2. For principal component analysis, the R function prcomp was used to calculate variance among transcript and sample subsets for the calculation of transcript weights and principal components. The top three components were used for visual inspection. For receiver-operating characteristic (ROC) curves and area under the curve 96 calculations, the R package "pROC" was used. Kaplan-Meier analysis was performed to determine survival outcome using the survfit function of the R package "survival"; transcripts displaying below-background (<0.1) expression were removed from this analysis for the microarray-profiled cohorts MCI and MCII.
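The Z-score scaling mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: it assumes scaling is applied per transcript across samples, that the sample standard deviation (n − 1) is used, and that the −2 to 2 range is obtained by clipping. The function name and the example row are hypothetical.

```python
from statistics import mean, stdev


def zscore_clipped(values, lo=-2.0, hi=2.0):
    """Standardize one transcript's log2 expression values, then clip to [lo, hi]."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        # A constant row carries no variation; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [min(hi, max(lo, (v - mu) / sd)) for v in values]


# Hypothetical row: nine low samples and one extreme sample.
row = [0.0] * 9 + [100.0]
scaled = zscore_clipped(row)  # the extreme sample is clipped to 2.0
```

Clipping keeps a few extreme samples from dominating the heatmap color scale; whether the original analysis clipped values or only limited the color range is not stated in the text.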
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All clinical patient sequencing and microarray data (Table 1) were available in-house through previous publication submissions. Initial description, interrogation, and results