Noncoding mutations target cis-regulatory elements of the FOXA1 plexus in prostate cancer

Prostate cancer is the second most commonly diagnosed malignancy among men worldwide. Recurrently mutated in primary and metastatic prostate tumors, FOXA1 encodes a pioneer transcription factor involved in disease onset and progression through both androgen receptor-dependent and androgen receptor-independent mechanisms. Despite its oncogenic properties however, the regulation of FOXA1 expression remains unknown. Here, we identify a set of six cis-regulatory elements in the FOXA1 regulatory plexus harboring somatic single-nucleotide variants in primary prostate tumors. We find that deletion and repression of these cis-regulatory elements significantly decreases FOXA1 expression and prostate cancer cell growth. Six of the ten single-nucleotide variants mapping to FOXA1 regulatory plexus significantly alter the transactivation potential of cis-regulatory elements by modulating the binding of transcription factors. Collectively, our results identify cis-regulatory elements within the FOXA1 plexus mutated in primary prostate tumors as potential targets for therapeutic intervention.

P rostate cancer is the second most commonly diagnosed cancer among men with an estimated 1.3 million new cases worldwide in 2018 1 . Although most men diagnosed with primary prostate cancer are treated with curative intent through surgery or radiation therapy, treatments fail in 30% of patients within 10 years 2 resulting in a metastatic disease 3 . Patients with metastatic disease are typically treated with anti-androgen therapies, the staple of aggressive prostate cancer treatment 4 . Despite the efficacy of these therapies, recurrence ultimately develops into lethal metastatic castration resistant prostate cancer (mCRPC) 4 . As such, there remains a need to improve our biological understanding of prostate cancer development and find novel strategies to treat patients.
Sequencing efforts identified coding somatic single-nucleotide variants (SNVs) mapping to FOXA1 in up to 9% 5-10 and 13% [9][10][11] of primary and mCRPC patients, respectively. These coding somatic SNVs target the Forkhead and transactivation domains of FOXA1 12 , altering its pioneering functions to promote prostate cancer development 10,13 . Outside of coding SNVs, whole-genome sequencing also identified somatic SNVs and indels in the 3'UTR and C-terminus of FOXA1 in~12% of mCPRC patients 14 . In addition to SNVs, the FOXA1 locus is a target of structural rearrangements in both primary and metastatic prostate cancer tumors, inclusive of duplications, amplifications, and translocations 9,10 . Taken together, FOXA1 is recurrently mutated taking into account both its coding and flanking noncoding sequences across various stages of prostate cancer development.
FOXA1 serves as a pioneer transcription factor (TF) that can bind to heterochromatin, promoting its remodeling to increase accessibility for the recruitment of other TFs 15 . FOXA1 binds to chromatin at cell-type specific genomic coordinates facilitated by the presence of mono-and dimethylated lysine 4 of histone H3 (H3K4me1 and H3K4me2) histone modifications 16,17 . In prostate cancer, FOXA1 is known to pioneer and reprogram the binding of the androgen receptor (AR) alongside HOXB13 18 . Independent from its role in AR signaling, FOXA1 also regulates the expression of genes involved in cell cycle regulation in prostate cancer [19][20][21] . For instance, FOXA1 co-localizes with CREB1 to regulate the transcription of genes involved in cell cycle processes, nuclear division, and mitosis in mCRPC [19][20][21][22][23][24][25] . FOXA1 has also been shown to promote feed-forward mechanisms to drive disease progression 26,27 . Hence, FOXA1 contributes to ARdependent and AR-independent processes favouring prostate cancer development.
Despite the oncogenic roles of FOXA1, therapeutic avenues to inhibit its activity in prostate cancer are lacking. In the breast cancer setting for instance, the use of cyclin-dependent kinases inhibitors have been suggested based on their ability to block FOXA1 activity on chromatin 28 . As such, understanding the governance of FOXA1 mRNA expression offers an alternative strategy to find modulators of its activity. Gene expression relies on the interplay between distal cis-regulatory elements (CREs), such as enhancers and anchors of chromatin interaction, and their target gene promoter(s) 29 . These elements can lie tens to hundreds of kilobases (kbp) away from each other on the linear genome but physically engage in close proximity with each other in the three-dimensional space 30 . By measuring contact frequencies between loci through the use of chromatin conformation capture-based technologies, it enables the identification of regulatory plexuses corresponding to sets of CREs in contact with each other 31,32 . By leveraging these technologies, we can begin to understand the three-dimensional organization of the prostate cancer genome and delineate the FOXA1 regulatory plexus.
Here, we integrate epigenetics and genetics from prostate cancer patients and model systems to delineate CREs establishing the regulatory plexus of FOXA1. We functionally validate a set of six mutated CREs that regulate FOXA1 mRNA expression. We further show that SNVs mapping to these CREs are capable of altering their transactivation potential, likely through modulating the binding of key prostate cancer TFs.

Results
FOXA1 is essential for prostate cancer proliferation. We interrogated FOXA1 expression levels across cancer types. We find that FOXA1 mRNA is consistently the most abundant in prostate tumors compared with 25 other cancer types across patients (Fig. 1a), ranking in the 95th percentile for 492 of 497 prostate tumors profiled in TCGA ( Supplementary Fig. 1a). Using the same data set we also find that FOXA1 is the most highly expressed out of 41 other Forkhead Box (FOX) factors in prostate tumors ( Supplementary Fig. 1b). We next analyzed expression data from DEPMAP and observed FOXA1 to be most highly expressed in prostate cancer cell lines compared with cell lines of other cancer types ( Supplementary Fig. 2a). Among the eight prostate cancer cell lines in the dataset (22Rv1, DU145, LNCaP, MDA-PCa-2B, NCI-H660, PrECLH, PC3, and VCaP), FOXA1 mRNA abundance is above the 90th percentile in all but one cell line (PrECLH) compared with the >56,000 protein coding and non-protein coding genes profiled ( Supplementary Fig. 2b). These new results gained from the TCGA and DEPMAP validate previous understanding that FOXA1 is one of the highest expressed genes in prostate cancer 33 .
Following up on FOXA1 mRNA expression levels, we interrogated the essentiality of FOXA1 for prostate cancer cell growth. RNAi-mediated essentiality screens compiled in DEP-MAP show that FOXA1 lies in the 94th percentile across six of the eight available prostate cancer cell lines: 22Rv1, LNCaP, MDA PCa 2B, NCI-H660, PC3, and VCaP cells (Fig. 1b, c). The median RNAi-mediated essentiality score for all prostate cell lines is significantly lower than all other cell lines, suggesting that FOXA1 is especially essential for prostate cancer cell proliferation (permutation test, p = 1 × 10 −6 , see Methods) ( Supplementary  Fig. 3a). Growth assays in LNCaP and VCaP cells following FOXA1 knockdown using two independent siRNAs (Fig. 1d, Supplementary Fig. 3b) show significant growth inhibition in LNCaP (siRNA #1: fourfold, siRNA #2: 3.35-fold) and VCaP (siRNA #1: 8.7-fold, siRNA #2: twofold) cells 5 days post transfection (Mann-Whitney U test, p < 0.05; Fig. 1e, f). In accordance with previous reports, our results using essentiality datasets followed by knockdown validation reveals that FOXA1 is oncogenic and essential for prostate cancer cell proliferation.
Identifying putative FOXA1 CREs. The interweaving of distal CREs with target gene promoters establishes regulatory plexuses with some to be ascribed to specific genes 31,32 . Regulatory plexuses stem from chromatin interactions orchestrated by various factors including ZNF143, YY1, CTCF, and the cohesin complex [34][35][36] . Motivated by the oncogenic role of FOXA1 in prostate cancer, we investigated its regulatory plexus controlling its expression. According to chromatin contact frequency maps generated from Hi-C assays performed in LNCaP prostate cancer cells, FOXA1 lies in a 440 kbp TAD (chr14: 37720002-38160000 ± 40 kbp adjusting for resolution) (Fig. 2a). By overlaying DNase-seq data from LNCaP prostate cancer cells, there are a total of 123 putative CREs reported as DNase I hypersensitive sites (DHS) that populate this TAD (Fig. 2a). We next inferred the regulatory plexus of FOXA1 using the cross cell-type correlation based on DNA accessibility (C3D) method 37 . C3D aggregates and draws correlation of DHS signal intensities between the cell line of choice and the DHS signal across all systems in the database 37 . Anchoring our analysis to the FOXA1 promoter and using    6,40 . This analysis identified 6 out of the 33 DHS marked with H3K27ac (18.2%) harboring one or more SNV(s) (10 total SNVs called from nine tumors) (Fig. 2b). We observe that these six CREs can be bound by multiple TFs in prostate cancer cells, including FOXA1, AR, and HOXB13 (Fig. 2b, Supplementary Fig. 4a-f). The Hi-C data from the LNCaP prostate cancer cells corroborates the C3D predictions as demonstrated by the elevated contact frequency between the region harboring the FOXA1 promoter and where the six CREs are located, compared with other loci in the same TAD (Fig. 3a). The six CREs lie in intergenic or intronic regions ( Fig. 3b-h). Together, histone modifications, TF-binding sites and noncoding SNVs support that these six putative CREs are active in primary prostate cancer. The Hi-C and C3D predictions suggest that they regulate FOXA1 expression.
SNVs mapping to FOXA1 CREs can alter their activity. Singlenucleotide variants can alter the transactivation potential of CREs 32,43-51 . In total, we found 10 SNVs called from 9 out of the 200 tumors that map to the six FOXA1 plexus CREs (Fig. 6a,  Supplementary Data). To assess the impact of these noncoding SNVs, we conducted luciferase assays comparing differential reporter activity between the variant and the wild-type allele of each CRE (Fig. 6b-k). We found that the variant alleles of 6 of the 10 SNVs displayed significantly greater luciferase reporter activity when compared with the wild-type alleles (Mann-Whitney U test, p < 0.05). Specifically, we observed the following fold-changes: chr14: 37 (Fig. 6b-h).
These results indicate that these SNVs can alter the transactivation potential of FOXA1 plexus CREs in prostate cancer cells.

Discussion
Modern technologies and understanding of the epigenome allow the possibility of probing CRE(s) involved in regulating genes implicated in disease. Despite FOXA1 being recurrently mutated [5][6][7][8]11 and playing potent oncogenic roles in prostate cancer etiology 9,10,13 , the CREs involved in its transcriptional regulation are poorly understood. Understanding how FOXA1 is expressed can provide a complementary strategy to antagonize FOXA1 in prostate cancer. We used the DHS profiled in prostate cancer cells to identify putative FOXA1 CREs by annotating these regions with five different histone modifications, TF-binding sites and noncoding SNVs profiled in prostate cancer cells and primary prostate tumors. Our efforts identified and validated a set of six active CREs involved in FOXA1 regulation, agreeing with a recent report where a subset of our CREs map to loci suggested to be in contact with the FOXA1 promoter 52 . The disruption of these six distal CREs each significantly reduced FOXA1 mRNA levels, similar to what has been demonstrated for ESR1 in luminal breast cancer 32 , MLH1 in Lynch syndrome 53 , MYC in lung adenocarcinoma and endometrial cancer 54 and AR in mCRPC 55,56 . Through combinatorial deletion of two CREs, FOXA1 mRNA levels were further reduced in comparison with single CRE deletions, raising the possibility of CRE additivity 57 . The deletion of the FOXA1 plexus CREs also significantly reduced prostate cancer cell proliferation at levels comparable to what has been reported upon deletion of the amplified CRE upstream of the AR gene in mCRPC 55 , suggestive of onco-CREs as reported in lung 54 , and prostate 55 cancer.
More than 90% of SNVs found in cancer map to the noncoding genome 58,59 with a portion of these SNVs mapping to CREs altering their transactivation potential 32,44-46 and/or downstream target gene expression 48,58,60 . We extended this concept with SNVs identified from primary prostate tumors mapping to FOXA1 plexus CREs. We observed that a subset of these SNVs can alter transactivation potential by modulating the binding of specific TFs whose cistromes are preferentially burdened by SNVs in primary prostate cancer 59 . Our findings complement recent reports of SNVs found in the noncoding space of FOXA1 that could affect its expression 14,61 . The FOXA1 plexus CREs we identified here are also reported to be target of structural variants in both the primary and metastatic settings 9,62 , including tandem duplication in~14% (14/101) mCRPC tumors over CRE2 62 , amplification, duplication, and translocation over CRE3, 4, 5 9 . Notably, the translocation and duplication defining the FOX-MIND enhancer driving FOXA1 expression reported in primary and metastatic settings harbors the CRE3 element we characterized 9 . Collectively, these studies combined with our discoveries reveal the fundamental contribution of the FOXA1 plexus in prostate cancer etiology. As a whole, our findings in conjunction with recent reports suggest that CREs involved in the transcriptional regulation of FOXA1 may be hijacked in prostate tumors through various types of genetic alterations.
Despite initial treatment responses from treating aggressive primary and metastatic prostate cancer through castration to suppress AR signaling 4 , resistance ensues as 80% of mCRPC tumors harbor either AR gene amplification, amplification of a CRE upstream of AR, or activating AR coding mutations 11,55,62 . Given the AR-dependent 15,18 and AR-independent 25 oncogenic activity of FOXA1 in prostate cancer, its inhibition is an appealing alternative therapeutic strategy. Our dissection of the FOXA1 cis-regulatory landscape complement recent findings by revealing loci that are important for the regulation of FOXA1. Theoretically, direct targeting of the CREs regulating FOXA1 would downregulate FOXA1 levels and could therefore serve as a valid alternative to antagonize its function.
Taken together, we identified FOXA1 CREs targeted by SNVs that are capable of altering transactivation potential through the modulation of key prostate cancer TFs. The study supports the importance of considering CREs not only as lone occurrences but as a team that work together to regulate their target genes, particularly when considering the impact of genetic alterations. As such, our work builds a bridge between the understanding of FOXA1 transcriptional regulation and new routes to FOXA1 inhibition. Aligning with recent reports 9,10,13 , our findings support the oncogenic nature of FOXA1 in prostate cancer. Gaining insight on the cis-regulatory plexuses of important genes such as FOXA1 in prostate cancer may provide new avenues to inhibit other drivers across various cancer types to halt disease progression.
Prostate cancer cell line gene essentiality. Essentiality scores were collected from the Cancer Dependency Map Project (DEPMAP) (https://depmap.org/portal/ download/; dataset description: 2018q4 versions of the "cell line metadata" and "combined RNAi" 64 , and all five non-cancer cell lines were removed (cell lines where the "Primary Disease" was listed in the metadata as one of the following: fibroblast, immortalized, immortalized_epithelial, non-cancerous, primary cells, or unknown). To compare gene essentiality between prostate cancer cell lines and others, essentiality scores for FOXA1 were collected from all available cell lines (n = 707). To perform a permutation test, the median of eight randomly selected cell lines was calculated one million times to generate a background distribution of essentiality scores across all cell types available. The median essentiality score from the eight prostate cancer cell lines was calculated and its percentile within the background distribution is reported.
siRNA knockdown and cell proliferation assay. In all, 300,000 LNCaP cells (Day 0) were reverse transfected with siRNA (siFOXA1 using Lipofectamine RNAimax reagent (ThermoFisher Scientific, Cat. No. 13778150)). Cells were counted using Countess automated cell counter (Invitrogen). Whole-cell lysates LNCaP cells after siRNA-mediated FOXA1 knockdown were collected at 96 h post transfection in RIPA buffer. Protein concentrations were determined through the bicinchoninic acid method (ThermoFisher Scientific, Cat. No. 23225). Then 25 µg of lysate was subjected to SDS-PAGE. Upon completion of SDS-PAGE, protein was transferred onto PVDF membrane (Bio-Rad, Cat. No. 1704156). The membrane was blocked with 5% non-fat milk for 1 h at room temperature with shaking. After blocking, anti-FOXA1 (Abcam Cat. No. 23737) in 2.5% non-fat milk was added, and was incubated at 4°C overnight. Next day, the blot was washed and incubated with IRDye 800CW Goat Anti-Rabbit IgG secondary antibody (LI-COR, Cat. No. 925-32211) at room temperature for 1 h. The blot was then washed and assessed with the Odyssey CLX imaging system (LI-COR).
Hi-C and TADs in LNCaP cells. Hi-C and TADs conducted and called, respectively, in LNCaP cells is publicly available off ENCODE portal (ENCSR346DCU). Visualization of the Hi-C dataset is available on the Hi-C Browser (http://promoter.bx.psu.edu/hi-c/) 66 .
For gRNA design, five to six unique crRNA molecules (Integrated DNA Technologies) were designed to tile across the region of interest using the CRISPOR tool (http://crispor.tefor.net/) 67  At last, 1 μL (100 μM) of electroporation enhancer (Integrated DNA Technologies) was added to the mix (7 μL total) prior to transfection. The entire transfection reaction was transfected into 350,000 cells through Nucleofection (SF Solution EN120-4D Nucleofector, Lonza). Cells were then harvested 24 h post transfection for RNA and DNA for RT-PCR and confirmation of deletion, respectively. Transient Cas9-mediated disruption of CREs. Deletion of elements through this method were achieved through the transfection of Cas9 nuclease protein complexed with the crRNA (Integrated DNA Technologies). In brief, five to six unique crRNA molecules (Integrated DNA Technologies) were designed to tile across the region of interest using the CRISPOR tool (http://crispor.tefor.net/) 67 and the Zhang lab CRISPR Design tools (http://crispr.mit.edu/) 68 . Each crRNA and tracrRNA (Integrated DNA Technologies) were duplexed according to company supplier protocol to a concentration of 50 μM. The six crRNA-tracrRNA duplexes were pooled into a single tube (6 μL per reaction), prior to adding 1 μL (5 μg) of Alt-R S.p HiFi Cas9 Nuclease 3NLS (Integrated DNA Technologies). The reaction was incubated at room temperature for 10 min for RNP complex formation. At last, 1 μL (100 μM) of electroporation enhancer (Integrated DNA Technologies) was added to the mix prior to transfection. The entire transfection reaction was transfected into 350,000 cells through Nucleofection (SF Solution EN120-4D Nucleofector, Lonza). Cells were then harvested 24 h post transfection for RNA and DNA for RT-PCR and confirmation of deletion, respectively. For double deletions, two sets of guide RNA-RNP complex (10 μg of Alt-R S.p HiFi Cas9 Nuclease 3NLS) were transfected and harvested 24 h post transfection for RNA and DNA for RT-PCR and confirmation of deletion, respectively. To control for double deletions, two negative control regions within the TAD were also compounded (Supplementary Data).
RT-PCR assessment of gene expression upon deletion of CREs. DNA and RNA were harvested with Qiagen AllPrep RNA/DNA Kit (Qiagen, Cat. No. 80204). Next, cDNA was synthesized from 300 ng of RNA using SensiFast cDNA Synthesis kit (Bioline, Cat. No. BIO-65054), and mRNA expression levels for various genes of interest were assessed. The primer sequences for expression evaluation are in Supplementary Data. Differential gene expression was calculated upon normalization with TBP (housekeeping gene). Statistical significance was calculated using Student's t test in R.
Confirmation of Cas9-mediated deletion of CREs. Deletion of CREs were confirmed through PCR amplification of the intended region for deletion, followed by the T7 endonuclease assay (Integrated DNA Technology). Primer sequences used for PCR amplification are in Supplementary Data. PCR products were then loaded onto a 1% agarose gel for visualization. The agarose gel to assess the on-target genome editing efficiency was done through densitometry using ImageJ. The correlation between on-target genome editing efficiency and FOXA1 mRNA expression reduction was drawn through Pearson's correlation in R.
Cell proliferation upon deletion of FOXA1 CREs. Pairs of gRNAs flanking the CREs of interest, FOXA1 promoter and control regions were designed using CRISPOR (http://crispor.tefor.net/) and Zhang lab CRISPR Design tool (http://crispr.mit.edu/) (Supplementary Data). Each pair of gRNAs were cloned into the lentiCRISPRv2 (Addgene; a gift from Feng Zhang #52961) and the lentiCRISPRv2-Blast (Addgene; a gift from Feng Zhang #83480) plasmid as previously described 69 . Lentiviral particles were generated in 293FT cells (Thermo-Fisher) using the pMDG.2 and psPAX2 packaging plasmids (Addgene; #12259 and #12260, a gift from Didier Trono), and collected 72 hrs post transfection. LNCaP cells were transduced for 24-48 hrs with equal amounts of virus, followed by selection with media containing puromycin (3.5 µg/mL, ThermoFisher) and blasticidin (7 µg/mL, Wisent). Cells were harvested upon selection for RNA and DNA for RT-PCR and confirmation of DNA cleavage, respectively. For cell proliferation, cells were seeded at equal density per well (on a 96-well plate; Day 1) upon puromycin and blasticidin selection. Growth of the cells were monitored by cell counting using Countess automated cell counter (Invitrogen). Cell numbers were calculated as a percentage compared to negative control. Statistical significance was calculated using Student's t test.
Luciferase reporter assays. Each region of interest was ordered as gBlocks from Integrated DNA Technologies. The regions were cloned into the BamHI restriction enzyme digest site of the pGL3 promoter plasmid (Promega). On Day 0, 90,000 LNCaP cells were seeded in 24-well plates. Next day (Day 1), pGL3 plasmids harboring the wild-type and variant sequences were co-transfected with the pRL Renilla plasmid (Promega) using Lipofectamine 2000. 48 h later, the cells were harvested, and dual luciferase reporter assays were conducted (Promega). Notably, inserts of both forward and reverse directions were tested using this assay as enhancer elements are known to be direction-independent. Final luminescence readings are reported as firefly luciferase normalized to renilla luciferase activity. The assessment of each mutation was conducted in five biological replicates. Statistical significance was assessed by Mann-Whitney U test in R. gBlock sequences are in Supplementary Data.
Allele-specific ChIP-qPCR. In brief, pGL3 plasmids containing the wild-type sequence and the mutant sequence used in the luciferase reporter assay were transfected into 7 million cells (2 µg per allele, per one million cells) using Lipofectamine 2000 (ThermoFisher Scientific), per manufacturer's instructions. Next day, each antibody (FOXA1 5 µg, Abcam, ab23738; AR 5 µg, Abcam, ab1083241; HOXB13 5 µg, Abcam, ab201682; SOX9 5 µg, Abcam, ab3697; GATA2 5 µg, Abcam, ab22849; FOXP1 5 µg, Abcam, ab16645; NKX3.1 10 µl, Cell Signalling Technology, #83700) was conjugated with 10 µL of each Dynabeads A and G (ThermoFisher Scientific) for each ChIP for 6 h with rotation at 4°C. When antibody beads conjugates were ready for use, cells were lifted using trypsin and fixed by resuspending with 300 µL of 1% formaldehyde in PBS for 10 min at room temperature. 2.5 M Glycine was added to quench excess formaldehyde (final concentration 0.125 M). Cells were then washed with cold PBS and lysed using 300 µL of Modified RIPA buffer (10 mM TrisHCl, pH 8.0; 1 mM EDTA; 140 mM NaCl; 1% Triton X-100; 0.1% SDS; 0.1% sodium deoxycholate) supplemented with protease inhibitor. The lysate was subject to 25 cycles of sonication (30 s ON 30 s OFF) using Diagenode Bioruptor Pico (Diagenode). In all, 15 µL of sonicated lysate was set aside as input with the rest used for chromatin pulldown through addition of antibody beads conjugates and overnight incubation at 4°C with rotation. Next day, the beads were washed once with Modified RIPA buffer, washed once with Modified RIPA buffer + 500 mM NaCl, once with LiCl buffer (10 mM TrisHCl, pH 8.0; 1 mM EDTA; 250 mM LiCl; 0.5% NP-40; 0.5% sodium deoxycholate) and twice with Tris-ETDA buffer (pH 8). After wash, beads and input were decrosslinked by addition of 100 µL Decrosslinking buffer and incubation at 65°C for 6 h. Samples were then purified and eluted. ChIP and input DNA were then used for allele-specific ChIP-qPCR using MAMA primers as described previously (Supplementary Data). Fold-change significance was calculated using Student's t test in R.
All analyses were done using hg19 reference genome coordinates.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Genomic and Epigenomic data sets used to support this study can be found from the following accession codes: primary tumors-H3K27ac ChIP-seq (GSE96652), SNVs called from primary tumors (https://dcc.icgc.org/projects/PRAD-CA), FOXA1, AR, and HOXB13 ChIP-seq in primary prostate tumors is available under the following accession code: GSE137527 and EGAS00001003928, TF ChIP-seq data were from public databases of ReMap and ChIP-Atlas. All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information